Preface



This volume contains the proceedings of the 10th ISAAC conference (Tenth Annual
International Symposium on Algorithms and Computation) held in Chennai,
India. This year’s conference attracted 71 submissions from as many as 17
different countries. Each submission was reviewed by at least three independent 
referees. After a week-long e-mail discussion, the program committee agreed to 
include 40 papers in the conference program. The high acceptance rate is clearly 
an indication of the quality of the papers received. We thank the program com- 
mittee members and the reviewers for their sincere efforts. 

We were fortunate to have three invited speakers this year, providing for a 
very attractive program: Kurt Mehlhorn (MPI, Saarbrücken, Germany), Éva Tardos
(Cornell University, U.S.A.), and Kokichi Sugihara (Univ. of Tokyo, Japan).
Moreover, the conference was preceded by a tutorial on a cutting-edge area, 
Web Algorithmics by Monika Henzinger (Compaq Systems Research Center, Palo 
Alto, U.S.A.) as a joint event with FST&TCS 99 (Foundations of Software Tech- 
nology and Theoretical Computer Science, December 13-15, 1999, Chennai). As a 
post-conference event, a two-day workshop on Approximate Algorithms by R. Ravi
(CMU, U.S.A.) and Naveen Garg (IIT, Delhi) was organized. We thank all the invited
speakers and special event speakers for agreeing to participate in ISAAC’99.

This year is a special year for the organizing institute Indian Institute of 
Technology, Madras and for the Computer Science and Engineering department 
at IIT, Madras; IIT is celebrating its 40th anniversary and it is the Silver Jubilee
year for the CSE department. The CSE department at IIT is proud to
host this prestigious conference and the allied events. We thank the Director 
of IIT, Madras, Prof. R. Natarajan, for making available the facilities of the 
institute for ISAAC related events. Several private companies and educational 
trusts came forward to extend financial support for ISAAC’99. The generous
support from IBM Solutions Research Centre, New Delhi; Jeppiar Educational
Trust, Chennai; Jaya Educational Trust, Chennai; Philips Software Centre Pvt.
Ltd, Bangalore; and Digital and Analog Computing Services (DACS), Bangalore is
gratefully acknowledged. The IIT Madras Alumni Association in North America
has contributed towards subsidized stay and travel expenses for several Indian
participants, enabling them to attend ISAAC’99. We thank the members of the
organizing committee for making it happen. Special thanks go to the staff of our
Institute and to Alfred Hofmann of Springer-Verlag for help with the proceedings.
The Engineering of Some Bipartite Matching Programs



Kurt Mehlhorn 

Max-Planck-Institut für Informatik,

Im Stadtwald,

66123 Saarbrücken, Germany,
http://www.mpi-sb.mpg.de/~mehlhorn

Over the past years my research group was involved in the development of 
three algorithm libraries: 

— LEDA, a library of efficient data types and algorithms [LED],

— CGAL, a computational geometry algorithms library [CGA], and

— AGD, a library for automatic graph drawing [AGD].

In this talk I will discuss some of the lessons learned from this work. I will do 
so on the basis of the LEDA-implementations of bipartite cardinality matching 
algorithms. The talk is based on Section 7.6, pages 360-392, of [MN99]. In this
book Stefan Näher and I give a comprehensive treatment of the LEDA system
and its use. We treat the architecture of the system, we discuss the functionality 
of the data types and algorithms available in the system, we discuss the imple- 
mentation of many modules of the system, and we give many examples for the 
use of LEDA. 

My personal level of involvement was very different in the three projects: 
The LEDA project was started in 1989 by Stefan Näher and myself and I have
been involved as a designer and system architect, implementer of algorithms 
and tools, writer of documentation and tutorials, and user of the system. For 
CGAL I acted as an advisor and for AGD my involvement was marginal.

The bipartite cardinality matching problem asks for the computation of a 
maximum cardinality matching $M$ in a bipartite graph $G = (A \cup B, E)$. A matching
in a graph $G$ is a subset $M$ of the edges of $G$ such that no two share an endpoint.
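
The definition of a matching can be turned into a direct test. The following is a minimal sketch in Python, not taken from LEDA; the function name `is_matching` and the representation of edges as pairs `(a, b)` with `a` in A and `b` in B are assumptions made for illustration.

```python
def is_matching(edges):
    """Return True if no two edges in `edges` share an endpoint.

    Each edge is a pair (a, b) with a on side A and b on side B.
    Nodes are tagged with their side so that equal labels on
    opposite sides are not confused with each other.
    """
    seen = set()
    for a, b in edges:
        for node in (("A", a), ("B", b)):
            if node in seen:
                return False  # this endpoint is already used by another edge
            seen.add(node)
    return True
```

The test runs in time linear in the number of edges, since every endpoint is inserted into the set at most once.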

I will discuss the following points: 

— Specification: We discuss several specifications of the problem and discuss 
their relative merits, in particular, with respect to verification and flexibility. 

— Checking and verification: We discuss how a matching algorithm can justify
its answers and how answers can be checked. 

— Representations of matchings: We discuss how matchings can be represented 
and what the relative merits of the representations are. 

— Reinitialization in iterative algorithms: Most matching algorithms work in 
phases. We discuss how to reinitialize data structures in a cost-effective way. 

— Search for augmenting paths by depth-first search or by breadth-first search: 
We discuss the relative merits of the two methods. 
Table 1. Running times of our matching algorithms. The first columns show the values
of n/10"^ and m/lO"*, respectively. The meaning of the other columns is explained in 
the text. A dash indicates that the program was not run on the instance. 

— The use of heuristics to find an initial solution: Matching algorithms can 
either start from an empty matching or can use a heuristic to construct an 
initial matching. 

— Simultaneous search for augmenting paths: The fastest matching algorithms 
search for augmenting paths in order of increasing length. 

— Documentation: We discuss the merits of literate programming for documentation
and why we use it to document our implementations.

Table 1 shows the running times of our bipartite matching algorithms; the
source code of all our implementations can be found in [MN99]. A plus sign
indicates the use of the greedy heuristic for finding an initial matching and a
minus sign indicates that the algorithm starts with the empty matching. The
algorithms HK [HK73] and AB [ABMP91] have a worst case running time of
$O(\sqrt{n}\,m)$ and the other algorithms have a worst case running time of $O(nm)$.
FFB stands for the basic version of the Ford and Fulkerson algorithm [FF63].
It runs in $n$ phases, uses depth-first search for finding augmenting paths and
uses $O(n)$ time at the beginning of each phase for initialization. Its best case
running time is $O(n^2)$. The algorithms dfs and bfs are variants of the Ford and
Fulkerson algorithm. They avoid the costly initialization at the beginning of each
phase and use depth-first and breadth-first search, respectively. The algorithms
HK and AB use breadth-first search and search for augmenting paths in order 
of increasing length. The last column shows the time to check the result of the 
computation. 
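
To make the phase structure concrete, here is a minimal Python sketch of an augmenting-path algorithm in the spirit of the bfs variant, with an optional greedy start. It is not the LEDA code; the names `greedy_matching` and `max_matching_bfs`, the adjacency-list input `adj` (one list of B-neighbours per A-node) and the parameter `n_b` (number of B-nodes) are assumptions for illustration. Each phase records only the nodes it actually visits in a fresh dictionary, which is one simple way of avoiding a full $O(n)$ reinitialization per phase.

```python
from collections import deque

def greedy_matching(adj, n_b):
    """Heuristic start: match every A-node to its first free neighbour."""
    mate_a = [None] * len(adj)
    mate_b = [None] * n_b
    for a, neighbours in enumerate(adj):
        for b in neighbours:
            if mate_b[b] is None:
                mate_a[a], mate_b[b] = b, a
                break
    return mate_a, mate_b

def max_matching_bfs(adj, n_b, use_heuristic=True):
    """Return a maximum matching as a list of (a, b) pairs."""
    if use_heuristic:
        mate_a, mate_b = greedy_matching(adj, n_b)
    else:
        mate_a, mate_b = [None] * len(adj), [None] * n_b
    for root in range(len(adj)):
        if mate_a[root] is not None:
            continue
        # parent[a] = (previous A-node, B-node currently matched to a);
        # only nodes reached in this phase are stored.
        parent = {root: None}
        queue = deque([root])
        end = None
        while queue and end is None:
            a = queue.popleft()
            for b in adj[a]:
                other = mate_b[b]
                if other is None:
                    end = (a, b)  # free B-node: augmenting path found
                    break
                if other not in parent:
                    parent[other] = (a, b)
                    queue.append(other)
        # Flip matched and unmatched edges along the path back to the root.
        while end is not None:
            a, b = end
            mate_a[a], mate_b[b] = b, a
            end = parent[a]
    return [(a, b) for a, b in enumerate(mate_a) if b is not None]
```

For example, `max_matching_bfs([[0, 1], [0], [1, 2]], 3)` returns a matching of size three with or without the heuristic; with the heuristic only one augmenting phase does any real work.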

We used bipartite group graphs $G_{n,m,k}$, as suggested by [CGM+97] in their
experimental study of bipartite matching algorithms, for our experiments. A
graph $G_{n,m,k}$ has $n$ nodes on each side. On each side the nodes are divided into
$k$ groups of size $n/k$ each (this assumes that $k$ divides $n$). Each node in $A$ has
degree $d = m/n$ and the edges out of a node in group $i$ of $A$ go to random nodes
in groups $i + 1$ and $i - 1$ of $B$.
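
A generator for such instances might look as follows. This is a sketch under stated assumptions, not the generator used for the experiments: the function name `group_graph` is hypothetical, group indices are taken modulo $k$ (the text does not say how the first and last group are treated), each endpoint is drawn uniformly and independently, and parallel edges are therefore possible.

```python
import random

def group_graph(n, m, k, seed=0):
    """Return adj, where adj[a] lists the B-neighbours of A-node a.

    Nodes on each side are numbered 0..n-1; group i consists of the
    nodes i*(n/k) .. (i+1)*(n/k)-1.
    """
    if n % k != 0 or m % n != 0:
        raise ValueError("k must divide n and n must divide m")
    rng = random.Random(seed)
    size, degree = n // k, m // n
    adj = []
    for a in range(n):
        i = a // size
        # Assumption: groups wrap around, so group -1 is group k-1.
        targets = ((i + 1) % k, (i - 1) % k)
        adj.append([rng.choice(targets) * size + rng.randrange(size)
                    for _ in range(degree)])
    return adj
```

The result can be passed directly to any matching routine that accepts adjacency lists for the A-side.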

The running times of our algorithms differ widely. We observe (the book attempts
to explain the observations, but we will not do so here) that the program
with the quadratic best case running time is much slower than the other
implementations, that dfs is almost always slower than bfs and frequently much slower,
that the use of the heuristic helps and the advantage is more prominent for the
slower algorithms, and that the asymptotically better algorithms are never much
slower than the asymptotically slower algorithms and sometimes much better.
We also see that the time for checking the result of the computation is negligible.
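
One way to see why checking can be so cheap is the following sketch. It assumes that the algorithm returns, together with the matching, a node cover of the same cardinality as a certificate; by König's theorem such a cover exists exactly when the matching is maximum. The text above does not state which certificate is used, so this choice, the function name `check_maximum_matching` and the data layout are assumptions for illustration.

```python
def check_maximum_matching(edges, matching, cover):
    """Verify that `matching` is a maximum matching of the bipartite graph.

    edges, matching: lists of pairs (a, b), a on side A and b on side B.
    cover: set of tagged nodes ("A", a) or ("B", b).
    """
    edge_set = set(edges)
    # 1. Every matching edge must be an edge of the graph.
    if any(e not in edge_set for e in matching):
        return False
    # 2. No two matching edges may share an endpoint.
    if len({a for a, _ in matching}) != len(matching):
        return False
    if len({b for _, b in matching}) != len(matching):
        return False
    # 3. Every edge of the graph must have an endpoint in the cover.
    if any(("A", a) not in cover and ("B", b) not in cover for a, b in edges):
        return False
    # 4. Any cover is at least as large as any matching, so equality
    #    proves that the matching is maximum.
    return len(cover) == len(matching)
```

All four steps are single passes over the input, so the check takes linear time.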

Table 1 is a strong case for algorithm engineering and its interplay with
the theoretical investigation of algorithms. We have algorithms with the same 
asymptotic bounds and widely differing observed behavior. The differences can 
be explained, sometimes analytically and sometimes heuristically, coined into 
implementation principles, and applied to other algorithms. See Sections 7.7 on 
maximum cardinality matching in general graphs, 7.8 on weighted matchings in 
bipartite graphs, and 7.9 on weighted matchings in general graphs of [MN99] to
see how we applied the lessons learned from bipartite cardinality matchings to 
other matching problems. 
Abstract. For storage and retrieval applications where access frequencies
are biased, uniformly-balanced search trees may be suboptimal.
Splay trees address this issue, providing a means for searching which 
is statically optimum and conjectured to be dynamically optimum. Sub- 
ramanian explored the reasons for their success, expressing local transfor- 
mations as templates and giving sufficient criteria for a template family 
to exhibit amortized $O(\log N)$ performance. We present a different formulation
of the potential function, based on progress factors along edges.
Its decomposition w.r.t. a template enables us to relax all of Subrama- 
nian’s conditions. Moreover it illustrates the reasons why template-based 
self-adjustment schemes work, and provides a straightforward way of eva- 
luating the efficiency of such schemes. 



1 Introduction 

The directory problem, that of handling an on-line series of INSERT, DELETE 
and FIND requests, is for most applications addressed satisfactorily by balanced 
search trees (BST) (see e.g., [9] for a trace of BST’s from [2] onwards), which
guarantee $O(\log N)$ worst-case time for each of these operations. For applications
however in which the access frequencies are highly biased the “overbalancedness” of
these trees yields suboptimal performance. This was addressed first, statically,
in the 1970’s with biased search trees (see i, ig, [II), and later dynamically in 
the 1980’s with splay trees (see m)- Splay trees achieve a logarithmic amorti- 
zed cost for searching and other operations, and are competitive w.r.t. any other 
static binary search tree on the same set of elements (see also [15]).

However US left some important issues open, among which the most intri- 
guing is the dynamic optimality conjecture: Are splay trees competitive even 
when compared to dynamically maintained search trees? To the best of the aut- 
hors’ knowledge this issue remains open, so we still do not know which is the 
best, at least in this sense, solution to the directory problem. Most probably 
this can be attributed to the lack of a general self-adjustment theory. In the 
1990’s Subramanian ([13]) made a step towards this by describing splaying as a
set of rules, called templates. Templates are comprised of two trees, the before-
and after-schema, each with a special current node. Templates specify a splay-step
in which the before-schema is replaced by the after-schema; this is repeated
continuously upwards along a path, until the current node reaches the root.
Subramanian proved that for such rules to have logarithmic amortized cost, the
following conditions are sufficient for binary search trees: (1) “strict growth”:
the set of nodes below the current node is strictly augmented at each stage;
(2) “depth reduction”: the (two) subtrees of the current node are linked strictly
nearer the root; (3) “progress”: descendants of template nodes must be moved
below the current node within a constant number of steps.

* This work was supported in part by grant TACIT 312304 at the Foundation of
Research and Technology, Hellas (FORTH).

In this paper we adopt the template framework of [13] and employ a potential
function analysis (i.e., we define for a tree $T$ a potential function $\Phi(T)$ and for
each operation $q$ which transforms $T$ to $T'$ we bound the change in potential
$\Phi(T) - \Phi(T') \ge -a(q) + c(q)$, where $c(q)$ is the actual and $a(q)$ the amortized cost
of $q$). Thereafter we pursue a different line of analysis, succeeding in extending
substantially [13]: (a) we define the progress factor along a path (edge) $u \to v$
as $\log(w(u)/w(v))$ where $w(x)$ is the weight of the subtree rooted at node $x$; (b)
we introduce the multiplicity $\mu(e)$ of a schema-edge $e$, defined as the number
of paths in the schema which pass through it, minus one; (c) we define the
potential function $\Phi(T)$, somewhat differently than in [TT|, as the sum of all
progress factors along edges of $T$.

All these enable us to express $\Phi(T)$, when applying a template, as three partial
sums: (I) over non-schema edges, (II) over the paths of the schemata applied,
and (III) a compensating addend for (II), summing the progress factors of each
schema edge $e$, $\mu(e)$ times. We show that, for before-schemata which are paths,
part III of $\Phi$ can be summed telescopically to $O(\log|T|)$. To achieve logarithmic
amortized cost it suffices to guarantee further that part III of $\Phi(T)$
is $\Omega(1)$. We show that if an after-schema contains a size-$d$ branching, all edges
of which have multiplicity at least $\mu$, part III is $\Omega(\mu d \log d)$ ($= \Omega(1)$ for
$\mu > 0$). Even when a branching does not appear explicitly, we prove that many
templates, quite easily recognizable by our method, offer sufficient gain because
a branching appears in an implicit intermediate splay-step. Moreover,
since multiplicities are easily visualizable, we obtain a handy calculus to assess
the efficiency of any splay rule-set.

Thus all conditions of [13] are relaxed: logarithmic amortized-cost splaying
can be achieved in any tree, while neither “strict growth” nor “depth reduction” 
nor any non-trivial “progress” need to be enforced: mere branching (explicit 
or implicit) is what makes splay efficient. By our work the vast majority of 
template-based self-adjustment schemes turn out to have, quite unexpectedly, 
logarithmic amortized cost! 

In sections 2 and 3 we give basic definitions and some metrics for templates. 
In sections 4 and 5 we present our calculus for estimating the cost of a splay 
operation and prove sufficient conditions for logarithmic amortized cost “splay- 
ing”. In section 6 we give characteristic applications. Section 7 is an epilogue 
with further open issues towards a self-adjustment theory. 
2 Templates: Schemata, Rules and Rule-Sets

We assume the reader to be familiar with the usual tree terminology: root, node, 
edge, children, parent, sibling, path, degree. We shall denote the node-set of a
tree $T$ with $V$ or $V(T)$ and the edge-set with $E$ or $E(T)$. Our trees will have four
characteristics: (1) they will be rooted: we shall imagine edges as oriented from
the root towards the leaves; (2) children will be ordered, pictorially from left to
right; (3) they will have a single distinguished node, cursor($T$); and (4) they will
have no nodes of degree one. The node-set $V(T)$ will consist of internal($T$) nodes
(those with degree $\ge 2$), and external($T$) (those with degree 0). Edges between
internal nodes will be called internal edges and their set will be denoted by $I(T)$.

Expanding on [13], a schema is a cursored tree $\rho$, where cursor($\rho$) is one of
its internal nodes. Given a cursored tree $T$ and a schema $\rho$, a matching of $\rho$
into $T$, $f : \rho \to T$, is a mapping which “preserves” the cursor, the edges, the
degrees and the order of the children. (So $f$ is a tree endomorphism for trees as
described above.) If a matching exists it will be unique and we shall say that $\rho$
matches into $T$ and denote the matching by $\rho \to T$. When referring to a schema
$\rho$ matched into a tree $T$, we shall denote the image of a node, an edge, a set
of edges, the cursor, etc., of $\rho$ in $T$ by appending only the “$\to T$” part. E.g.,
external($\rho$) $\to T$ is the set $(\rho \to T)(\mathrm{external}(\rho))$. We do the same for the inverse
image of a node, edge, etc., of $T$ back to $\rho$: e.g., for $e \in T$, $e \to \rho$ is the inverse
image $(\rho \to T)^{-1}(e)$ (when defined).

A template-rule $\sigma$ is a triple $\sigma = (\sigma^-, \sigma^+, \sigma^*)$ where $\sigma^-$ and $\sigma^+$ are schemata
with the same number of external nodes (but not necessarily the same number of
internal nodes), and $\sigma^*$ is a bijection external($\sigma^-$) $\to$ external($\sigma^+$), determining
the rearrangement of these leaves. In the case that we are dealing with search
trees, $\sigma^*$ simply follows the left-to-right ordering of the external nodes of $\sigma^-$ and
$\sigma^+$. Our approach however allows us to address the general case in which nodes
may be ordered by other criteria.

We shall call $\sigma^-$ the “before”-schema and $\sigma^+$ the “after”-schema. A rule $\sigma$
is applicable in $T$ if and only if $\sigma^-$ matches in $T$. In this case we define the
application of $\sigma$ to $T$ as the (unique) tree $T^{[\sigma]}$ with node-set
$V(T) - (I(\sigma^-) \to T) \cup I(\sigma^+)$ such that the edges outside $\sigma^- \to T$ remain unaffected, $\sigma^+$
matches in $T^{[\sigma]}$ and the nodes external($\sigma^-$) $\to T$ correspond to external($\sigma^+$) $\to T^{[\sigma]}$
according to $\sigma^*$. Although $T^{[\sigma]}$ is a re-linking of parts of $T$ with $\sigma^+$, we
shall imagine it as a new “copy”. (See figure 1. Cursor positions are indicated by
arrows, and the external-nodes’ mapping $\sigma^*$ by uniquely labelling the respective
nodes.)

Template-rules are intended to be applied iteratively along the path from 
some initial node to some other final node (usually the root of our tree), selecting 
at each stage a matching template-rule. For this purpose we shall need a whole 
set of rules. It is assumed that we possess a complete rule-set, i.e., a set S of 
rules sufficiently rich in order that given any cursor position (other than the final 
one) within any cursored tree T there exists in S at least one applicable rule 
for T. A rule-set may be given either explicitly or implicitly. Given a complete 
rule-set S and an initial tree T with an initial position for cursor(T), we get a 
sequence of cursored trees,

$$\overbrace{T = T_0 \to T_1 \to T_2 \to \cdots \to \underbrace{T_{k-1} \to T_k}_{\text{splay step}} \to \cdots \to T_L = S(T)}^{\text{splay operation}}$$

where each $T_k$ is obtained by applying an applicable rule from $S$, until cursor($T$)
reaches its final position (e.g., the root of $T$). Following the by now traditional
terminology, we shall call $T_{k-1} \to T_k$ a splay-step, and the whole transformation
$T_0 \to T_L$ a splay-operation.

Fig. 1. A template rule: “before” and “after” schema (shaded nodes are internal)

At this level of generality some splay-steps may be, in some sense, ineffective.
Consider two rules, $\sigma_\alpha$ and $\sigma_\beta$: if $\sigma_\beta^-$ matches in $\sigma_\alpha^+$ and is applied immediately
after $\sigma_\alpha$, all relinking due to $\sigma_\beta$ takes place within $\sigma_\alpha^+$. In this case $\sigma_\alpha$ and $\sigma_\beta$
combine to form one rule, of size equal to that of $\sigma_\alpha$. Such “combined” rules
should be already present in $S$ and selected immediately for application. Since
certainly we should not pay for just “splaying around”, we shall assume that $S$ is
not only complete but also effective, i.e., for any tree $T$ and any rule $\sigma_\alpha$ matched
in $T$, there exists some rule $\sigma_\beta$ which matches in $T^{[\sigma_\alpha]}$ (thus it is applicable) but
which does not match in $\sigma_\alpha^+$ (thus it is “effective”). Effective rule-sets should
be, of course, also applied accordingly: usually there will be a general prescribed
“rule-selection policy” which will enforce only effective applications of rules. In
many cases such a policy may be “select the largest applicable rule”.

3 Splay Histories and the Amortized Cost of Splaying 

We now turn our attention to the computational cost of splaying. We define 
some reasonable metrics for templates, rules and rule-sets. The size of a tree 
$T$ is defined to be the number of external nodes of $T$. The size of a rule
$\sigma = (\sigma^-, \sigma^+, \sigma^*)$ is the size of either $\sigma^-$ or $\sigma^+$ (they are equal). For a rule-set $S$,
MaxSize($S$) is the maximum size of any rule in $S$. Implicit in this definition is
the fact that we are dealing with bounded-height and bounded-degree templates.

Consider now a schema $\rho$ and its internal edges, $I(\rho)$. Let us denote by
branch($\rho$) the maximum number of internal nodes which are siblings of each
other. If branch($\rho$) $= 1$ then all internal edges form a path, in which case we shall
denote by bottom($\rho$) the internal node of $\rho$ farthest from the root. If branch($\rho$) $\ge 2$
then the edges $I(\rho)$ form a proper tree, in which case bottom($\rho$) is undefined.

G.F. Georgakopoulos and D.J. McClurkin

The branching $B_S$ of a rule-set $S$ is the minimum branch($\sigma^+$) of any
“after”-schema $\sigma^+$ of rules in $S$. A rule $\sigma$ is called path-oriented if branch($\sigma^-$) $= 1$
and cursor($\sigma^-$) $=$ bottom($\sigma^-$); it is (explicitly) branching if branch($\sigma^+$) $\ge 2$.

We assume that the worst-case cost of a splay step is $O(\mathrm{MaxSize}(S))$; this
cost includes: (1) the cost of inspecting $T$ (around cursor($T$)) and selecting an
applicable rule from $S$; (2) performing the corresponding splay-step; and (3)
performing any further computation related to whatever information is stored
in the nodes. A splay (operation) on $T$ can have a cost even $\Omega(\mathrm{size}(T))$, so we
will be interested in obtaining an amortized cost analysis. We assume that we
start from an initial tree and we perform a series of $M$ splay operations, $s_k$,
according to rule-set $S$:

$$\overbrace{T = T^{(0)} \to T^{(1)} \to T^{(2)} \to \cdots \to \underbrace{T^{(k-1)} \to T^{(k)}}_{\text{splay operation}} \to \cdots \to T^{(M)}}^{\text{splay history}}$$

Let each splay operation $s_k$, applied to $T_{k-1} = T^{(k-1)}$, cost $C(s_k, T_{k-1})$. We shall define
a non-negative function $\Phi$, on trees with weights, bounding the cost in the
following sense:

$$C(s_k, T_{k-1}) \le l(T_k) + (\Phi(T_{k-1}) - \Phi(T_k))$$

where $l(T_k)$ is a convenient cost-function defined on $T_k$. Since in such an
expression potential differences cancel telescopically, the positivity of $\Phi$ guarantees
that $\sum_k C(s_k, T_{k-1})$ is less than $\sum_k l(T_k)$ (up to the initial potential $\Phi(T_0)$),
and so $l(T)$ can be taken to be the
amortized cost of a splay operation. We shall examine, among other related
issues, under what general conditions this cost is logarithmic as a function of
size($T$).
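
The telescoping argument can be written out in one line. The following is a worked restatement under the notation assumed here ($M$ operations, trees $T_0, \ldots, T_M$, potential $\Phi \ge 0$), not an additional result:

```latex
\sum_{k=1}^{M} C(s_k, T_{k-1})
  \;\le\; \sum_{k=1}^{M} l(T_k) + \sum_{k=1}^{M} \bigl(\Phi(T_{k-1}) - \Phi(T_k)\bigr)
  \;=\; \sum_{k=1}^{M} l(T_k) + \Phi(T_0) - \Phi(T_M)
  \;\le\; \sum_{k=1}^{M} l(T_k) + \Phi(T_0).
```

The last step uses only $\Phi(T_M) \ge 0$; the term $\Phi(T_0)$ is a one-time charge that depends on the initial tree and not on the number of operations.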



4 Our Logistics Scheme: Progress and Edge-Multiplicities 

We assume there to be a weight function, $w : V(T) \to [1, \infty)$, defined on nodes
$u$ of $T$, such that (a) $w(u)$ is super-additive, i.e., if node $u$ has children $v_1, \ldots, v_d$
then $w(u) \ge \sum_{k=1}^{d} w(v_k)$; (b) $w(u)$ depends only on the set of elements in
the subtree rooted at $u$, and not on their structure. The standard example of
a weight function is $w(u) = \mathrm{size}(T_u)$, where $T_u$ is the subtree with root $u$. (We
assume that $w(u) \ge 1$ because we shall only deal with quotients $w(u)/w(v)$.)
Since along an edge $e = (u, v)$ one essentially passes from a subtree of size $w(u)$
to one with size $w(v)$, we define the progress factor along $e$ as

$$\phi(e) = \phi((u,v)) = \log(w(u)/w(v)).$$

Analogously for a path $P = (u_0, \ldots, u_n)$, i.e., a set of edges $(u_{k-1}, u_k)$, $k = 1, \ldots, n$,
we define the progress factor of $P$ as $\phi(P) = \log(w(u_0)/w(u_n))$. We define as
the potential of $T$, $\Phi(T)$, the sum of all progress factors:

$$\Phi(T) = \sum_{e \in E(T)} \phi(e)$$

In the case of full binary trees this corresponds closely to the Sleator-Tarjan
potential function, $\Phi_{ST}(T) = \sum_{u \in V(T)} \log(w(u))$. In particular
$\Phi(T) = \Phi_{ST}(T) + \log(w(\mathrm{root}(T)))$.
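
The two potentials are easy to compare numerically. The sketch below is illustrative only: it assumes the standard weight (number of external nodes below a node), logarithms to base 2, and a representation of trees as nested tuples in which anything that is not a tuple is an external node. The function names are hypothetical.

```python
import math

def weight(t):
    """Number of external nodes in the subtree t (the standard weight)."""
    return sum(weight(c) for c in t) if isinstance(t, tuple) else 1

def potential(t):
    """Sum of the progress factors log2(w(u)/w(v)) over all edges (u, v)."""
    if not isinstance(t, tuple):
        return 0.0
    w = weight(t)
    return sum(math.log2(w / weight(c)) + potential(c) for c in t)

def potential_st(t):
    """Sleator-Tarjan style potential: sum of log2(w(u)) over all nodes."""
    if not isinstance(t, tuple):
        return 0.0  # external nodes have weight 1, so they contribute log2(1) = 0
    return math.log2(weight(t)) + sum(potential_st(c) for c in t)

# A full binary tree with five external nodes.
tree = (("a", "b"), ("c", ("d", "e")))
difference = potential(tree) - potential_st(tree)
```

For this tree `difference` equals $\log_2 5$ up to rounding, in line with the identity stated above. The sketch recomputes weights repeatedly and is meant for small examples only.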

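For most edges $e$, $\phi(e)$ does not change when applying a rule $\sigma$, so many
terms cancel out in the expression for $\Delta\Phi = \Phi(T) - \Phi(T^{[\sigma]})$. To capture these
cancellations we rewrite the expression for $\Phi(T)$ in a way that depends on the
rule we apply. Given a tree $\rho$, the multiplicity $\mu(e)$ of an edge $e$ in $\rho$, is the total
number of paths $(\mathrm{root}(\rho), u)$, for $u \in \mathrm{external}(\rho)$, passing through $e$, minus one
(we subtract one to simplify expressions throughout):

$$\text{for } e \in \rho, \quad \mu(e) = |\{(\mathrm{root}(\rho), u) : u \in \mathrm{external}(\rho),\ e \in (\mathrm{root}(\rho), u)\}| - 1$$

We rewrite $\Phi(T)$ w.r.t. a schema $\sigma$ as $\Phi(T) = \Phi^{\mathrm{I}}_\sigma(T) + \Phi^{\mathrm{II}}_\sigma(T) + \Phi^{\mathrm{III}}_\sigma(T)$, where

$$\Phi^{\mathrm{III}}_\sigma = -\sum_{e \in I(\sigma) \to T} \mu(e \to \sigma)\,\phi(e), \qquad \Phi^{\mathrm{II}}_\sigma = \sum_{u \in \mathrm{external}(\sigma) \to T} \phi((\mathrm{root}(\sigma) \to T, u)),$$

$$\Phi^{\mathrm{I}}_\sigma = \sum_{e \in E(T) - (E(\sigma) \to T)} \phi(e)$$

The $\Phi^{\mathrm{I}}_\sigma$-sum is over “non-schema” edges; the $\Phi^{\mathrm{II}}_\sigma$-sum is over the paths from
the “root” of $\sigma$ to its external nodes and the $\Phi^{\mathrm{III}}_\sigma$-sum compensates for this by
subtracting the progress factor for internal edges the appropriate number of
times, i.e., their multiplicity w.r.t. $\sigma$. We write

$$\Delta\Phi = \bigl(\Phi^{\mathrm{I}}_{\sigma^-}(T) - \Phi^{\mathrm{I}}_{\sigma^+}(T^{[\sigma]})\bigr) + \bigl(\Phi^{\mathrm{II}}_{\sigma^-}(T) - \Phi^{\mathrm{II}}_{\sigma^+}(T^{[\sigma]})\bigr) + \bigl(\Phi^{\mathrm{III}}_{\sigma^-}(T) - \Phi^{\mathrm{III}}_{\sigma^+}(T^{[\sigma]})\bigr).$$

Edges outside $\sigma$ remain unaffected, i.e., $\Phi^{\mathrm{I}}_{\sigma^-}(T) = \Phi^{\mathrm{I}}_{\sigma^+}(T^{[\sigma]})$. Each subtree hanging
from an external node, $u$, of $\sigma^-$ although repositioned is again hanging from
an external node of $\sigma^+$, so the total progress along $(\mathrm{root}(\sigma^-), u)$ or $(\mathrm{root}(\sigma^+), u)$
is the same, so $\Phi^{\mathrm{II}}_{\sigma^-}(T) = \Phi^{\mathrm{II}}_{\sigma^+}(T^{[\sigma]})$. (Here we use property (b) of the weight
function $w(u)$.) We therefore get an exact expression for $\Delta\Phi$ by $\Phi^{\mathrm{III}}$-sums only:

$$\Delta\Phi = \underbrace{-\sum_{e \in I(\sigma^-) \to T} \mu(e \to \sigma^-)\,\phi(e)}_{\text{“loss” part of potential change}} \;+\; \underbrace{\sum_{e \in I(\sigma^+) \to T^{[\sigma]}} \mu(e \to \sigma^+)\,\phi(e)}_{\text{“gain” part of potential change}}$$

So $\Phi$ changes by “losing” progress factors along $e \in I(\sigma^-) \to T$ and “gaining”
along $e \in I(\sigma^+) \to T^{[\sigma]}$, each as many times as their multiplicity. We visualize
this situation, thinking of each edge $e \in I(\sigma^-) \to T$ as being labeled by $\mu(e \to \sigma^-)$
“minuses” and each $e \in I(\sigma^+) \to T^{[\sigma]}$ by $\mu(e \to \sigma^+)$ “pluses”. See figure 2.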
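
The bookkeeping behind this decomposition can be checked on a small schema. The sketch below is an illustration under assumptions: the schema is a nested tuple whose non-tuple leaves name its external nodes, `w` maps each external node to the weight of the subtree hanging from it, weights are additive (the standard size example), and logarithms are to base 2. The function names are hypothetical. It computes the sum of progress factors over all schema edges, the sum over root-to-external paths, and the compensating term built from the multiplicities.

```python
import math

def leaves(t):
    """External nodes of the schema t, from left to right."""
    return [t] if not isinstance(t, tuple) else [x for c in t for x in leaves(c)]

def decompose(schema, w):
    """Return (edge_sum, path_sum, compensation) for the schema edges."""
    def wt(t):
        return sum(w[x] for x in leaves(t))

    edge_sum = 0.0      # sum of phi(e) over all schema edges
    compensation = 0.0  # minus the sum of mu(e) * phi(e) over internal edges
    stack = [schema]
    while stack:
        u = stack.pop()
        for v in u:
            phi = math.log2(wt(u) / wt(v))
            edge_sum += phi
            if isinstance(v, tuple):         # (u, v) is an internal edge
                mu = len(leaves(v)) - 1      # paths through the edge, minus one
                compensation -= mu * phi
                stack.append(v)
    root_w = wt(schema)
    path_sum = sum(math.log2(root_w / w[x]) for x in leaves(schema))
    return edge_sum, path_sum, compensation

edge_sum, path_sum, compensation = decompose((("a", "b"), "c"),
                                             {"a": 1, "b": 2, "c": 4})
```

Here `edge_sum` agrees with `path_sum + compensation` up to rounding: every internal edge is counted once per path through it in the path sum, and the compensating term removes the surplus.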

Two simple and crucial lemmata follow: 
Fig. 2. Template rules: “+” and “−”

Lemma 1 (“Vertical” Lemma): Let $P$ be a path in tree $T$ with weight
$w(\cdot)$, and $w_T = w(\mathrm{root}(T))$. Let $c(e)$ be a constant for each edge $e$ in $P$, and
$c_{\max} = \max_{e \in P}\{c(e)\}$. Then

$$\sum_{e \in P} c(e)\,\phi(e) \le c_{\max}\,\phi(P) \le c_{\max} \log(w_T).$$

Proof: Easily obtained by a “telescopic” product (sum of logarithms). □

Lemma 2 (“Horizontal” Lemma): Let $C \subseteq \mathrm{children}(u)$ for $u$ a node of tree
$T$. Then

$$\sum_{v \in C} \phi((u,v)) \ge |C| \log |C|.$$

Proof: An easy consequence of the super-additivity of $w(\cdot)$ and the well-known
fact that for positive numbers $x_k$, $k = 1, \ldots, n$, their arithmetic mean is larger
than or equal to the geometric mean. □
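
For readers who want the two ingredients of the proof spelled out, one possible chain of inequalities is the following. It is a reconstruction from the definitions of $\phi$ and $w$ given above, not a quotation of the authors' argument:

```latex
\sum_{v \in C} \phi((u,v))
  = |C| \log w(u) - \log \prod_{v \in C} w(v)
  \;\ge\; |C| \log w(u) - |C| \log \frac{\sum_{v \in C} w(v)}{|C|}
  \;\ge\; |C| \log w(u) - |C| \log \frac{w(u)}{|C|}
  = |C| \log |C| .
```

The first inequality is the arithmetic-geometric mean inequality applied to the weights $w(v)$, $v \in C$; the second uses super-additivity, $\sum_{v \in C} w(v) \le w(u)$.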

The following third lemma expresses the fact that the minimum for lemma 
2 cannot always be achieved. Lemma 3 will not be useful for our main theorem, 
but it will be for an interesting application. 

Lemma 3 (“Ladder” Lemma): Let $\rho$ be a path-oriented schema, with length
$h_\rho$ and total weight $W_\rho$. Then

^^(e) > max(hplog—,hpj. 
eGP '^p J 

Proof: (A slight modification — to fit our working framework — of the proof in 

H- □ 

If during a splay operation of $L$ splay-steps, the “loss”-part is accumulated
along a path, and each “gain”-part is along branchings then the total “loss”
will be less than $\mathrm{MaxSize}(S)\,O(\log(w_T))$ (by the “vertical” lemma) and the total
“gain” will be greater than $L\,\Omega(1)$ (by the “horizontal” lemma). Since $\Delta\Phi$ is the
total gain minus the total loss, the derived relation

$$L\,\Omega(1) \le \mathrm{MaxSize}(S)\,O(\log(w_T)) + \Delta\Phi \qquad (1)$$

(where $\Phi \ge 0$) will prove a logarithmic amortized cost for a splay operation.
We can guarantee such a bound under very mild conditions.



5 Productive and Progressive Schemata 



To guarantee that all “−” will accumulate along a path we consider only path-oriented
rules. To guarantee that some “+” will appear along at least two edges
leading to siblings, we consider (temporarily) only explicitly branching rules.

There is still a third thing we must guarantee: that “minuses” gathered along
a path do not overlap, at least not in an unrestricted manner. When we apply
each “next” rule $\sigma_\beta$, we lose potential along $I(\sigma_\beta^-) \to T$, but along this path
we find “gain” produced by the previous rule, $\sigma_\alpha$. If the new multiplicities for
$e \in T$, $\mu(e \to \sigma_\beta^-)$, are less than the old, $\mu(e \to \sigma_\alpha^+)$, then previous “gain”
compensates for next “loss”, and “loss” appears only along path
$P = (I(\sigma_\beta^-) \to T) - (E(\sigma_\alpha^+) \to T)$. Since $I(\sigma_\beta^-) \to T$ is a path, and $S$ is applied effectively, paths
like $P$ extend continuously upward in a non-overlapping fashion, so the “vertical”
lemma applies. A simple criterion is sufficient to guarantee that $\mu(e \to \sigma_\alpha^+)$ is
strictly greater than $\mu(e \to \sigma_\beta^-)$ for all edges $e$ in the overlapping region: We
shall call a rule $\sigma$ progressive if in its after-schema there exists at least one
internal node lying strictly below cursor($\sigma^+$). Let us combine all this in the
following basic theorem:

Theorem 1: If $S$ is an effective rule-set with path-oriented, explicitly branching
and progressive rules, then splay operations on $T$ have an amortized cost of

$$O\!\left(\frac{\mathrm{MaxSize}(S)}{B_S \log B_S}\,\log(w_T)\right).$$



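Proof: Let $\sigma_k$, $k = 1, \ldots, L$, be the sequence of rules applied: $T_{k-1} \to T_k$. We
get $\Delta\Phi$:

$$\Delta\Phi = \sum_{k=1}^{L} \left( -\sum_{e \in I(\sigma_k^-) \to T_{k-1}} \mu(e \to \sigma_k^-)\,\phi(e) + \sum_{e \in I(\sigma_k^+) \to T_k} \mu(e \to \sigma_k^+)\,\phi(e) \right)$$

Within each $T_k$, $k = 1, \ldots, L$, we define the edge-sets $P_k$, $Q_k$ and $R_k$ based on
$\sigma_k^+$ and $\sigma_{k+1}^-$:

$$Q_k = (I(\sigma_{k+1}^-) \to T_k) \cap (I(\sigma_k^+) \to T_k), \qquad P_k = (I(\sigma_{k+1}^-) \to T_k) - Q_k, \qquad R_k = (I(\sigma_k^+) \to T_k) - Q_k$$

Recalling that $I(\sigma_{k+1}^-)$ is a path, we see that $P_k$ is a path “above” $\sigma_k^+ \to T_k$, $Q_k$ is a path
inside schema $\sigma_k^+$ (part of $I(\sigma_{k+1}^-)$) and $R_k$ consists of the remaining internal edges
of $\sigma_k^+$. See figure 3. Since the multiplicities are bounded by MaxSize($S$) (in fact
by MaxSize($S$) $- 1$) we get:

$$\sum_{e \in I(\sigma_{k+1}^-) \to T_k = Q_k \cup P_k} \mu(e \to \sigma_{k+1}^-)\,\phi(e) \;\le\; \sum_{e \in Q_k} \mu(e \to \sigma_{k+1}^-)\,\phi(e) + \mathrm{MaxSize}(S)\,\phi(P_k)$$

Fig. 3. +/− cancellation along $Q_k$

Moreover, by progressiveness we get $\mu(e \to \sigma_k^+) \ge \mu(e \to \sigma_{k+1}^-) + 1$ for $e \in Q_k$,
and since internal edges have positive multiplicity, we get:

$$\sum_{e \in I(\sigma_k^+) \to T_k = Q_k \cup R_k} \mu(e \to \sigma_k^+)\,\phi(e) \;\ge\; \sum_{e \in Q_k} \mu(e \to \sigma_{k+1}^-)\,\phi(e) + \sum_{e \in Q_k \cup R_k} \phi(e)$$

The important thing now is that all paths $P_k$, $k = 1, \ldots, L$, even $P_0 = I(\sigma_1^-) \to T_0$,
are disjoint paths of the initial tree $T = T_0$: they consist of edges not matched
by any before-schema up to the $k$-th step (a fact easy either to visualize or
to prove inductively). Moreover their progress factors in $T_k$ are equal to their
corresponding progress factors in $T$, since the weights of their subtrees are left
untouched up to the $k$-th step. Summing for $\Delta\Phi$, for $k = 1, \ldots, L$, the partial
sums over $Q_k$ cancel out, and we are left with:

$$\Delta\Phi = \Phi(T_0) - \Phi(T_L) \;\ge\; \underbrace{-\mathrm{MaxSize}(S) \left( \sum_{k=0}^{L-1} \phi(P_k) \right)}_{\text{part I}} + \underbrace{\sum_{k=1}^{L} \left( \sum_{e \in I(\sigma_k^+) \to T_k} \phi(e) \right)}_{\text{part II}}$$

Since branch($\sigma_k^+$) $\ge B_S$, the theorem follows by applying the vertical lemma to
part I and the horizontal lemma to part II (see equation (1)). □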

Notice that splay can “pay” also for any other operation with an actual
cost of $O(L)$. If splay starts at depth $h$, then we are permitted to perform
$L = \Omega(h/\mathrm{MaxSize}(S))$ splay-steps. So splay can pay not only for itself, but also
for the steps needed to reach the initial cursor-position starting from the root
(e.g., during a FIND operation).

Theorem 1 does not address path-to-path rules, i.e., non-branching rules,
in which both the “before”- and “after”-schemata are paths. However we can
relax the explicit branching condition: most path-to-path rules are, under mild
conditions, branching rules in disguise. The next lemma, although not in the
strongest possible form, suffices to reveal this fact:

Lemma 4: (a) (see also [13]) Let $\sigma_k$ be a path-to-path rule, and let all nodes in
children(bottom($\sigma_k^-$)) be linked strictly above bottom($\sigma_k^+$). Then the gain-part
of $\sigma_k$ is $\Omega(1)$, in an amortized sense (specifically every two splay-steps). (b) Let
$\sigma_k$ be a path-to-path rule, and let plus($\sigma_k^-$) $\subseteq$ children(bottom($\sigma_k^-$)) denote those
external nodes $u$ of $\sigma_k^-$ for which the edge $(u, \mathrm{bottom}(\sigma_k^-)) \in T_{k-1}$ was an
internal edge of $\sigma_{k-1}^+$ (i.e., carries a “+”). If any node $u$ in plus($\sigma_k^-$) is linked
strictly above bottom($\sigma_k^+$), then the gain-part of $\sigma_k$ is $\Omega(1)$ in an amortized sense
(specifically every two splay-steps).

Proof: after presenting our ±-calculus in theorem 1, we can prove lemma 4 easily:
(a) Rules σ satisfying the stated condition can be considered to hide a branching
obtained by an intermediate phase: we factor σ into two rules α = (α⁻, γ_α, α*)
and β = (γ_β, β⁺, β*), where α⁻ = σ⁻ and β⁺ = σ⁺ (see figure 4). The first
intermediate schema, γ_α, consists of three nodes, r as root and u, v as two
children, where u is cursor(γ_α). children(bottom(σ⁻)) are the children of u, and
children(bottom(σ⁺)) are the children of v. The remaining external nodes of
σ⁺ hang arbitrarily from root r. The second intermediate schema, γ_β, consists of
nodes u and r as internal nodes and its external nodes are those of γ_α linked
to u and r. All this is possible since no node in children(bottom(σ⁻)) will be a
child of bottom(σ⁺). Schema γ_α is explicitly branching, so it has positive gain.




Fig. 4. A branching in disguise (a) 
The “loss” in potential associated with passing from γ_α to γ_β (a “−” along edge
(r,u)) is transferred to σ⁻ by adding a series of “−” along the path of σ⁻ and
this is why we keep children(bottom(σ⁺)) at their positions: the progress factor
along the path from root(γ_β) to their parent then is equal to the progress along
the path from root(σ⁺) to them. In doing this we may lose all gain of step k,
but in total the gain from steps k and k + 1 will be Ω(1), whatever these steps
are.

(b) The second case is somewhat similar: Charging the path of schema σ⁻ another
“minus” including the edge below cursor(σ⁻) with a “plus”, we are free to
assign one “plus” to this edge in the next schema, σ⁺, thus gaining Ω(1) for
our potential. Again these extra “minuses” may cause no gain at step k, but in
total the gain from steps k and k + 1 will be Ω(1), whatever these steps are.
(Alternatively one can observe that such rules produce an explicit branching if
applied twice.) □

Let us refer to path-to-path rules as in the above lemma as implicitly branching
rules, and let S^(r) denote the set of “composite” rules obtained from r
consecutive applications of rules from S. Our final theorem fulfills our promises:

Theorem 2: Let S be a rule-set for which, for some r, S^(r) is effective with
path-oriented, progressive and, either explicitly or implicitly, branching rules. Splay
operations on tree T with weight w_T will have amortized cost (here B_S takes
into consideration any implicitly branching schemata):



$$O\!\left(\frac{\mathrm{MaxSize}(S)}{B_S \log B_S}\,\log(w_T)\right).$$



Proof: Straightforward, according to the discussion above. □



6 Examples: Classic Splay Trees, Median-Split Trees and 
Path Balance 

Consider the rule set for classic splay trees. The “zig-zag” case is explicitly
branching, and the “zig-zig” case is implicitly branching according to lemma
4(a). The “zig” case guarantees no gain, but if we always apply the larger
rule whenever possible, the “zig” case will be applied at most once, during the
last splay-step. Note that our “+/−” calculus easily gives an amortized number
of 4 log N + 2 nodes visited for classic splay.
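
For readers who want to experiment with the three classic rules, the following is a minimal sketch of bottom-up splaying in Python. It is not taken from the paper: the names `Node`, `rotate`, `splay`, `build_right_path`, `inorder` and `height` are illustrative assumptions, and the code makes no attempt to measure the potential function discussed above.

```python
class Node:
    """Binary search tree node with a parent pointer (illustrative only)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate(x):
    """Rotate x above its parent, preserving the in-order sequence."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root using the zig, zig-zig and zig-zag rules."""
    while x.parent:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate(x)                      # zig: only used next to the root
        elif (g.left is p) == (p.left is x):
            rotate(p)                      # zig-zig: rotate the parent first
            rotate(x)
        else:
            rotate(x)                      # zig-zag: rotate x twice
            rotate(x)
    return x

def build_right_path(n):
    """Degenerate tree: keys 0..n-1 hanging as a chain of right children."""
    root = cur = Node(0)
    for key in range(1, n):
        child = Node(key)
        cur.right, child.parent = child, cur
        cur = child
    return root, cur

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

def height(node):
    return -1 if node is None else 1 + max(height(node.left), height(node.right))

root, deepest = build_right_path(16)
top = splay(deepest)
print(top.key, height(top), inorder(top) == list(range(16)))
```

Splaying the deepest node of a straight path uses only zig-zig steps plus at most one final zig, which is why the resulting tree is much shallower than the original path.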

In figure 5 we show a template for a variety of trees of our design; we call
them “median-split” trees. These trees are B-tree-like because, except for the
root, all nodes have degree ranging from δ to 2δ. From the pluses and minuses
in the picture we obtain immediately that the “loss” part has a constant factor
of O(δ) and the gain part, also, O(δ) (recall the previous section). Therefore
during splay we visit an amortized number, h, of nodes, where h ∈ O(log N),
and where the constant is independent of δ. This result compares with that of
Sherk (see [TU]); but here we have a simpler rule-set, with rules of size O(δ)
(in Sherk's work templates were of size O(δ²)), we have variable degrees, and a
proof of just a few lines (given theorem 1) instead of many pages. (Moreover we
have strong reasons to conjecture that even templates like those of Sherk cannot
offer an O(log_δ N) amortized number of nodes visited.)

Fig. 5. Main Rule for “Median-Split” splay trees (α, β, γ = degrees of nodes, all no
less than δ)

In figure 6 we give a rule-set which guarantees logarithmic amortized cost 
but does not fulfil the criteria of [1^: one of the lowermost subtrees is not 
repositioned nearer to the root. 




Fig. 6. Non-Subramanian Rules. Square nodes hang from “+” edges



Our final application is a simplification of the analysis of the “path-balance”
heuristic given in [1]. In figure 7(a) we see the simplest branching template-rule. If
applied along a “straight” path of length L (figure 7(b)) it halves the path, similar 
to a repeated application of the “zig-zig” rule. Applying this halving log times, 
we obtain the “path-balance” heuristic discussed in na, m- “Minuses” can be 
transferred along the initial path, thus accumulating only < 2 log “minuses” 
Fig. 7. (a) Rotate-Parent; (b) “Path-Halving”: multiple application of (a); (c) “Ladder
lemma”



per edge. By the “ladder” lemma and our analysis we get: 

A<P>- log N log hL + max log 

(The case for a non-straight path is treated quite similarly.) The obscure way
of proving this in [1] can thus be avoided by our calculus, while an interesting
issue is illuminated: the gain and loss are again separated (a fact which
is not so clear in S): the gain comes from the after-schema and the loss is
attributed to the before-schema. Moreover our approach suggests a splay-rule better
than “path-balance” in the sense that it has O(log N) amortized complexity,
instead of O(log N log log N / log log log N): apply “path-halving” only as long as
the length of the path we work upon is over 2 log N. The interesting fact here is
that the depth of all nodes along the path is reduced to O(log N), a result not
achievable by other template-rules (so far).



7 Epilogue and Further Open Issues

We have proved that under very mild and natural conditions “almost all” local 
bounded transformations along a path of a tree are splay operations, i.e., with 
logarithmic amortized cost. Our result extends previous results in all directions, 
and offers a handy calculus for estimating the amortized cost of a candidate set 
of rules for splay. 

Nonetheless, quite a few improvements are possible and many interesting
issues are left open. Among them we mention the following: We would like to
extend our theory to templates of unbounded degree and/or unbounded height.
To prove stronger forms of the lemma concerning implicitly branching rules. To
formulate a similar theory for top-down splay (if possible). To obtain “gain” not
only from after-schemata but from correlating an after-schema with its
before-schema. To handle non-path-oriented rules. To provide tools for accurate
appraisal of which templates are best, in any possible and reasonable sense. To
check whether improved, or otherwise special, results can be obtained by other
potential functions or “edge factors”, say by discretising our progress factors, or
by height-originated, or even by asymmetric (e.g., by distinguishing “left” from
“right”) edge factors (see for example [14]). To provide a systematic basis for
probabilistic analysis of the efficiency of template-rules. And finally to provide
a basis for proving lower bounds as well.

Abstract. A static dictionary is a data structure for storing a subset
S of a finite universe U so that membership queries can be answered
efficiently. We explore space efficient structures to also find the rank of
an element if found. We first give a representation of a static dictionary
that takes n lg m + O(lg lg m) bits of space and supports membership
and rank (of an element present in S) queries in constant time, where
n = |S| and m = |U|. Using our structure we also give a representation
of an m-ary cardinal tree with n nodes using n⌈lg m⌉ + 2n + o(n) bits of
space that supports the tree navigational operations in O(1) time, when
m is o(2^(lg n / lg lg n)). For arbitrary m, we give a structure that takes the
same space and supports all the navigational operations, except finding
the child labeled i (for any i), in O(1) time. Finding the child labeled i
in this structure takes O(lg lg lg m) time.



1 Introduction and Motivation 

A static dictionary is a data structure for storing a subset S of a finite
universe U so that membership queries can be answered efficiently. This problem
has been widely studied and various structures have been proposed to support 
membership in constant time I5T7ITO in slightly different models. There are 
many situations where one is interested in finding the rank of an element found 
(say when the elements are marks of an exam, and one is interested in the relative 
place of a mark in the ranking) . Our focus in this paper is on space efficient data 
structures to support the rank operation, which asks for the number of elements 
in the set less than or equal to the given element. Our model of computation 
is an extended RAM machine model that permits constant time arithmetic and 
boolean bitwise operations. 

Another motivation for studying the rank operation comes from the recent
succinct representation of m-ary cardinal trees [2]. A cardinal tree of degree k is a
rooted tree in which each node has k positions for an edge to a child. A binary
tree is a cardinal tree of degree 2. An ordinal tree is a rooted tree of arbitrary
degree in which the children of each node are ordered. The representation [2] of
an m-ary cardinal tree essentially has two parts: one part giving the ordinal
information of the tree using 2n + o(n) bits and the other part storing the children
information of each node in the tree using n lg m bits, where n is the number of
nodes. To navigate around the tree, in particular, to find the child labeled i of a
node, we need to find the rank of the element i in the ordinal part information
of the node. The representation of Benoit et al. [2] supports this operation in
O(lg lg m) time. Clearly, to perform this operation in constant time, a structure
for static dictionary taking n lg m + o(n) bits supporting the rank operation in
constant time suffices. Though we could achieve this when m is o(2^(lg n / lg lg n)),
for the general case we have a structure that supports rank and membership in
O(lg lg lg m) time.

If we want to support the rank operation for every element in the universe,
there is a lower bound of Ω(lg lg m) time per query even if the space used is
polynomial in n [1]. Willard gave a structure that answers rank (and hence
membership) queries in O(lg lg m) time using O(n lg m) bits of space. Fiat et
al. [7] gave a structure that answers membership queries in constant time and
rank queries in O(lg n) time using n lg m + O(lg lg m + lg n) bits of space. The
structure given by Pagh [12] answers membership and rank queries in constant
time when the size of the set n is ω(m lg lg m / lg m), using n lg(m/n) + O(n)
bits. Our focus here is to support rank queries for only the elements that are
present in the given set.

The only structure we know of to support membership and rank for those
elements found is due to Benoit et al. [2] en route to their efficient cardinal tree
representation. Their structure supports membership and rank in O(lg lg m) time
using at most n lg m bits. We give an alternate structure that supports both these
operations in O(1) time using n lg m + O(lg lg m) − Θ(n) bits. As a matter of fact,
the structure due to Benoit et al. supports both these operations in constant
time using at most n lg m bits as long as n > lg m whereas our structure supports
both these operations using the same time and space as long as n > lg lg m. In
the smaller range, both these structures take O(lg n) time if only n lg m bits are
allowed.

In Section 2, we give a space efficient static dictionary structure that answers
membership and rank queries in constant time. This structure builds upon the
recent enhancement by Pagh [12] of the FKS [8] dictionary and uses
n lg m + O(n + lg lg m) bits. In Section 3, we use an interesting idea to remove
the O(n) term in the space complexity of the structure. In this section, we also
outline space efficient structures to support the select operation (find the j-th
smallest element in the given set). In Section 4, we outline the m-ary cardinal
tree representation of Benoit et al. [2] and explain how our rank dictionary
structure can be used to improve the running time from O(lg lg m) to
O(lg lg lg m) for finding the child labeled i, if it exists, for any i. We also
illustrate another improvement to the structure so that all the navigational
operations can be supported in constant time if m is o(2^(lg n / lg lg n)).

2 A Rank Structure Taking n lg m + O(n + lg lg m) Bits

Fredman et. al.[Sj have given a structure that takes nig m + 0(lg Ig m + n^/lg n) 
bits and supports membership in O(1) time. Schmidt and Siegel [13] have
improved this space complexity to n lg m + O(lg lg m + n) bits. We refer to this
structure as the FKS dictionary in the later sections.

The original FKS construction to store a set S, as described in [8], has four
basic steps:

— A function h_{k,p}(x) is found that maps S into [0, n² − 1] without collisions. It
suffices to choose h_{k,p}(x) = (kx mod p) mod n² with suitable k < p < n² lg m,
where p is a prime. Here, the values k and p depend on the set S.

— Next, a function h_{κ,r}(z) is found that maps h_{k,p}(S) into [0, n − 1] so that the
sum of the squares of the collision sizes is not too large. Again, it suffices to
choose h_{κ,r}(z) = (κz mod r) mod n, where r is any prime greater than n²
and κ ∈ [0, r] so that Σ_{0≤j<n} |h_{κ,r}^{-1}(j) ∩ h_{k,p}(S)|² < 3n.

— For each non-empty bucket i, a secondary hash function h_i is found that
is one-to-one on the collision set. We choose h_i(z) = (k_i z mod r) mod c_i²,
where k_i ∈ [0, r − 1] and c_i is the size of the collision set. The element x ∈ S
is stored in location C_i + h_i(h_{κ,r}(h_{k,p}(x))), where C_i = c_0² + c_1² + ... + c_{i−1}².
This locates all n items within a table of size 3n, say A*[1 . . . 3n].

— Finally the table is stored without any vacant locations in an array A[1 . . . n]
in the same order.
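
The steps above can be tried out with a small two-level hashing sketch in Python. It is a simplification made for illustration, not the construction of the paper: it hashes keys directly with a single prime p > m (skipping the first reduction to [0, n² − 1]), it scans multipliers in increasing order rather than choosing them cleverly, and it keeps the sparse table A* instead of compressing it into A. All function names are assumptions.

```python
def is_prime(q):
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True

def next_prime(q):
    while not is_prime(q):
        q += 1
    return q

def _place(slots, x, mult, p):
    """Try to put x into its slot; report False on a collision."""
    j = (mult * x % p) % len(slots)
    if slots[j] is not None:
        return False
    slots[j] = x
    return True

def build_fks(keys, m):
    """Two-level perfect hash table for a non-empty set of keys from {1..m}."""
    keys = sorted(set(keys))
    n = len(keys)
    p = next_prime(m + 1)              # prime larger than every key
    # Level 1: find a multiplier whose squared bucket sizes sum to < 3n.
    for k in range(1, p):
        buckets = [[] for _ in range(n)]
        for x in keys:
            buckets[(k * x % p) % n].append(x)
        if sum(len(b) ** 2 for b in buckets) < 3 * n:
            break
    else:
        raise ValueError("no suitable top-level multiplier found")
    # Level 2: a bucket of size c gets c*c slots and a collision-free multiplier.
    offsets, sizes, mults, table = [], [], [], []
    for b in buckets:
        size = len(b) ** 2
        for mult in range(1, p):
            slots = [None] * size
            if all(_place(slots, x, mult, p) for x in b):
                break
        else:
            raise ValueError("no suitable secondary multiplier found")
        offsets.append(len(table))     # plays the role of C_i
        sizes.append(size)
        mults.append(mult)             # plays the role of k_i
        table.extend(slots)
    return {"k": k, "p": p, "n": n, "offsets": offsets,
            "sizes": sizes, "mults": mults, "table": table}

def member(d, x):
    j = (d["k"] * x % d["p"]) % d["n"]
    if d["sizes"][j] == 0:
        return False
    slot = d["offsets"][j] + (d["mults"][j] * x % d["p"]) % d["sizes"][j]
    return d["table"][slot] == x

d = build_fks([3, 17, 42, 99, 256, 511, 777], 1000)
print(member(d, 42), member(d, 43), len(d["table"]))
```

The length of `table` is the sum of the squared bucket sizes, so the level-1 test keeps it below 3n, matching the size of A* in the description above.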



The composite hash function requires the parameters k, p, κ and r for h_{k,p}
and h_{κ,r}, a table K[0 . . . n] storing the parameters k_i for secondary hash
functions h_i, a table C[0 . . . n] listing the locations C_i and finally a
compression table D[1 . . . 3n], where D[j] gives the index, within A, of the item
(if any) that hashes to the value j in A*. Thus this composite hash function
description requires O(n lg n + lg lg m) bits of space.

Schmidt and Siegel [13] first observe that up to ⌊lg n⌋ + 1 secondary hash
functions are sufficient to store the elements of the set. They also show how to
represent this composite hash function using O(n + lg lg m) bits of space. We
briefly describe their representation below. Parameters k and p require
O(lg n + lg lg m) bits each and κ and r take O(lg n) bits each. The parameters for
the secondary hash functions k_1, k_2, . . . , k_{⌊lg n⌋+1} are stored in an array,
which takes O((lg n)²) bits. The table K contains, for its ith sequence of bits,
the integer a_i in unary, if k_{a_i} is the multiplier (the secondary hash key)
associated to hash the bucket i, 0 ≤ i < n. This table is an O(n) bit string. We store an o(n)
bit auxiliary structure along with this bit string to support rank and select 
operations |hl5l1 H] on it in constant time. Using this and the array of multipliers, 
given an i, we can find the multiplier associated with the bucket i in constant
time. The table C, which contains the values C_i (= c_0² + c_1² + · · · + c_{i−1}²), is
encoded as follows. First the values c_i² are stored in a table T_0 in unary notation
(in order of appearance, separated by 0’s), which is of length at most 4n. We
also store an auxiliary structure of o(n) bits to support rank and select on both
the bits, in constant time. Now, given an i, C_i is nothing but the rank of the ith
0 (i.e. C_i = rank_1(select_0(i))), which can be found in constant time. For the
compression table D, we store a bit string of length 3n where the ith bit is a 0
if A*[i] is empty and 1 otherwise. We also store an o(n) bit auxiliary structure to
support rank and select operations on this in constant time. When an element is
hashed to a location in D, the rank of the bit in that location in the bit vector
representation of D gives the location of the element in the array A.
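
The two bit-vector encodings just described can be illustrated with a short Python sketch. It is only a functional model: the o(n)-bit constant-time rank/select structures are replaced here by plain precomputed lists (an assumption made for brevity), and the names `BitVector`, `encode_squares`, `bucket_offset` and `compressed_index` are hypothetical.

```python
class BitVector:
    """Bit string with rank/select answered from precomputed tables."""
    def __init__(self, bits):
        self.bits = list(bits)
        self.prefix = [0]                       # prefix[i] = number of 1s in bits[:i]
        for b in self.bits:
            self.prefix.append(self.prefix[-1] + b)
        self.zeros = [i for i, b in enumerate(self.bits) if b == 0]

    def rank1(self, pos):
        """Number of 1s in bits[0..pos], inclusive."""
        return self.prefix[pos + 1]

    def select0(self, i):
        """Position of the i-th 0 (i counts from 1)."""
        return self.zeros[i - 1]

def encode_squares(sizes):
    """Unary table: c_i^2 ones for each bucket, each group followed by a 0."""
    bits = []
    for c in sizes:
        bits.extend([1] * (c * c))
        bits.append(0)
    return BitVector(bits)

def bucket_offset(unary, i):
    """C_i = c_0^2 + ... + c_{i-1}^2, i.e. rank1(select0(i)); C_0 = 0."""
    return 0 if i == 0 else unary.rank1(unary.select0(i))

def compressed_index(occupancy, pos):
    """0-based index in A of the item held at occupied position pos of A*."""
    return occupancy.rank1(pos) - 1

unary = encode_squares([2, 0, 1, 3])
print([bucket_offset(unary, i) for i in range(5)])

occupancy = BitVector([0, 1, 0, 0, 1, 1, 0, 1])
print([compressed_index(occupancy, p) for p in (1, 4, 5, 7)])
```

With bucket sizes 2, 0, 1, 3 the offsets are the running sums of 4, 0, 1, 9, and the occupied cells of the sparse table map to consecutive cells of the compact array.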

Now, to obtain rank for the element in the set, we could simply store the
rank with each element in the FKS table. However this takes n lg m + n lg n +
O(n + lg lg m) bits. In the rest of this section, we describe how we can get rid of
the n lg n term.

Pagh [12] has observed that each bucket j of the hash table may be resolved
with respect to the part of the universe hashing to bucket j. Thus we can
save space by compressing the hash table part (i.e. table A above) of the data
structure, storing in each location not the element itself, but only a quotient
information that distinguishes it from the part of U that hashes to this location.
The quotient function, slightly modified from that of Pagh, is as follows:

q_{k,p}(x) = ((x div p)·⌈p/r⌉ + (kx mod p) div n²)·⌈r/n⌉ + (κz mod r) div n

where z = (kx mod p) mod n² and the parameters k, p, κ and r are as defined
in the FKS perfect hash function. It is easy to see that q_{k,p}(x) for x ∈ U is
O(m/n) (as (κz mod r) div n < r/n and (kx mod p) div n² < p/r).
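
The idea behind quotienting can be seen on a single hash level. The sketch below is an illustrative simplification, not the two-level function q_{k,p} above: it assumes one hash function h(x) = (kx mod p) mod n with p prime and k invertible modulo p, and the names `hash_and_quotient` and `recover` are made up for the example. It shows that the bucket number together with the quotient determines x, so the key itself need not be stored.

```python
def ceil_div(a, b):
    return -(-a // b)

def hash_and_quotient(x, k, p, n):
    """Return (bucket, quotient) of x for h(x) = (k*x mod p) mod n."""
    t = (k * x) % p
    bucket = t % n
    # High part: x div p. Low part: t div n, which is < ceil(p / n).
    quotient = (x // p) * ceil_div(p, n) + t // n
    return bucket, quotient

def recover(bucket, quotient, k, p, n):
    """Invert hash_and_quotient; needs k invertible modulo the prime p."""
    width = ceil_div(p, n)
    high, t_div = divmod(quotient, width)    # x div p, and t div n
    t = t_div * n + bucket                   # rebuild t = k*x mod p
    x_mod_p = (pow(k, -1, p) * t) % p        # modular inverse (Python 3.8+)
    return high * p + x_mod_p

k, p, n, m = 5, 13, 4, 200
ok = all(recover(*hash_and_quotient(x, k, p, n), k, p, n) == x for x in range(m))
print(ok)
```

Because the bucket number already carries about lg n bits of information about x, the stored quotient needs only about lg(m/n) bits, which is where the n lg(m/n) term comes from.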

Thus the total space to store all the quotient values along with the hash
function will be n lg(m/n) + O(n + lg lg m) bits. To find an element, we compute
its quotient value, apply the composite hash function to determine a location
and check whether the quotient value appears in that location. Now, with each
element we also store the rank of the element in the set for an extra space of
n⌈lg n⌉ bits. Thus if the element is found, we can get its rank from the rank
information stored in its location.

Thus we have 

Theorem 1. A static dictionary for a subset S of size n of a finite universe
U = {1, . . . , m} can be constructed using n lg m + O(n + lg lg m) bits of space so
that membership and rank queries can be answered in O(1) time.

3 A Rank Structure Taking n lg m + O(lg lg m) Bits

In this section we illustrate a method by which the space used by the structure
in the last section can be reduced by cn bits for any parameter c < (1 − ε) lg n,
0 < ε < 1. The trick is to store only the last (lg n − c) bits of the rank instead of
storing the entire value of the rank along with each element. Suppose the sorted
list of the elements of the set is divided into 2^c blocks of size roughly n/2^c each.
Then the information stored with each element is precisely its rank within its
block. In another array, we store the index of the (i·n/2^c)th element in the sorted
order of the elements (i.e. the first element of the i-th block), for 1 ≤ i ≤ 2^c.

Given an element, the membership test proceeds as in the case of our modified
FKS strategy (as mentioned in the last section). Once an element is found, the
block to which it belongs (in the sorted order) can be found by doing a binary
search (using c steps) on the first elements of each block stored in the separate
array. The rank of an element within its block is stored with the element in
the FKS dictionary. From these two pieces of information, we can obtain the rank
of the element.
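
A minimal Python sketch of this block decomposition follows. It is not the paper's data structure: a Python dict stands in for the quotient-based FKS dictionary (an assumption), block leaders are kept as actual key values, and the names `build_block_rank` and `rank` are illustrative.

```python
from bisect import bisect_right

def build_block_rank(keys, c):
    """Split the sorted keys into at most 2**c blocks; keep per-key offsets."""
    elems = sorted(set(keys))
    n = len(elems)
    size = -(-n // (1 << c))                 # block size, ceil(n / 2^c)
    leaders = elems[::size]                  # first element of each block
    # Stand-in for the membership dictionary: key -> rank within its block.
    within = {x: i % size for i, x in enumerate(elems)}
    return leaders, within, size

def rank(struct, x):
    """Number of stored elements <= x, or None if x is not stored."""
    leaders, within, size = struct
    if x not in within:
        return None
    block = bisect_right(leaders, x) - 1     # binary search over block leaders
    return block * size + within[x] + 1

keys = [5, 1, 9, 14, 20, 33, 47, 52, 60, 71, 88]
s = build_block_rank(keys, 2)
print([rank(s, x) for x in sorted(keys)], rank(s, 2))
```

Each stored offset is smaller than the block size, so it fits in roughly lg n − c bits, while the binary search over at most 2^c leaders takes about c comparisons.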

The space required, in addition to the n lg(m/n) + O(n + lg lg m) bits used to
store the quotient values and the hash function information, is
n lg n − cn + 2^c lg n bits. Let the space occupied by the hash function and the
auxiliary storage for the FKS dictionary be d(n + lg lg m) bits. Choose c such
that cn > 2^c lg n + dn. Then the total space requirement will be
n lg m + O(lg lg m) − Θ(n) bits. If n is Ω(lg lg m), we can choose c such that
cn > 2^c lg n + d(n + lg lg m) so that the total space will be n lg m − Θ(n) bits.
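
As a reading aid, the accounting behind this bound can be written out term by term. The grouping labels are ours, and d denotes the constant introduced in the preceding paragraph:

```latex
\begin{align*}
\underbrace{n\lg\frac{m}{n}}_{\text{quotients}}
+ \underbrace{d\,(n + \lg\lg m)}_{\text{hash function}}
+ \underbrace{n\lg n - cn}_{\text{ranks within blocks}}
+ \underbrace{2^{c}\lg n}_{\text{block leaders}}
&= n\lg m + d\lg\lg m - \bigl(cn - dn - 2^{c}\lg n\bigr).
\end{align*}
```

The identity uses n lg(m/n) + n lg n = n lg m. For a constant c > d the bracketed term is (c − d)n − 2^c lg n, which is Θ(n) once n is large enough, giving the stated n lg m + O(lg lg m) − Θ(n) bits.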

Note that we do not actually store the elements in the array locations, but
store only the quotient value of the element in the location to which it hashes.
So we describe below how, given a location, we can find in constant time the
element of the set whose quotient is stored in that location.

For this purpose, we store the values of k^{-1} and κ^{-1} along with other
parameters, which require O(lg lg m + lg n) bits of extra space. Now, given a location
l, let q be the quotient value stored in that location and let x be the actual
element of the given set that hashes to that location.

Table D (given in the last section) can be used to find the location l* in
the virtual array A* in which the element should have been stored, using a
select operation on the bit representation of D. Now, using the table T_0 (i.e.
the compressed form of table C), we can find the bucket into which the element
has hashed, which is nothing but the value of (κz mod r) mod n. Also
q mod ⌈r/n⌉ gives us the value (κz mod r) div n. From these two values, we can
find the value of κz mod r from which, using κ^{-1}, we can find z. Note that
z is nothing but (kx mod p) mod n². Now, (q div ⌈r/n⌉) mod ⌈p/r⌉ gives the
value of (kx mod p) div n². Using these two values, we can find (kx mod p)
from which the value of x mod p can be found using the value of k^{-1}. Again,
(q div ⌈r/n⌉) div ⌈p/r⌉ gives the value of x div p. Using these two values, we can
find the value x.

Thus we have, 

Theorem 2. There exists a static dictionary for a subset S of size n of a finite
universe U = {1, . . . , m} that uses n lg m + O(lg lg m) − Θ(cn) bits of space and
answers membership queries in constant time and rank queries in O(c) time
where 1 ≤ c < (1 − ε) lg n for any positive constant ε < 1.



Corollary 1. There exists a static dictionary for a subset S of size n of a finite
universe U = {1, . . . , m} that uses n lg m + O(lg lg m) − Θ(n) bits of space and
answers membership and rank queries in O(1) time.



Corollary 2. There exists a static dictionary for a subset S of size n of a finite
universe U = {1, . . . , m} that uses n lg m − Θ(n) bits of space and answers
membership and rank queries in O(1) time when n = Ω(lg lg m).

Note the time-space tradeoff in the main theorem above. In particular, if we
are willing to support the rank operation in O(lg lg n) time, then the space
complexity comes down to n lg m + O(lg lg m) − Θ(n lg lg n) bits.



3.1 Static Dictionary Supporting Select 

Suppose we want to support only membership and select operations efficiently.
To support select, besides the modified FKS dictionary to support membership,
we can store in an array the pointer to the ith smallest element of the set, for
1 ≤ i ≤ n. This requires an additional n lg n bits of space.

To further reduce space, we again store only the last lg n − c bits (i.e. the
position of the jth element within a block of size n/2^c) for some parameter c (to
be determined) and in a separate array store the index of the (n·i/2^c)th element
in the sorted order of the elements, for 1 ≤ i ≤ 2^c. Given a j, to find the jth
element, we do the following. Find the last lg n − c bits of the position of the
jth element from the first array. Now for each choice of the first c bits, find
the element stored in the location given by the lg n bits. If that element lies
between the elements ranked n(j − 1)/2^c and nj/2^c (which can be found using
the pointers stored in the second array), output that element as the jth element.
Clearly, there will be a unique choice of the first c bits, which gives the location
of the j-th smallest element.

As in the last section, c can be chosen in such a way that cn > 2^c lg n + dn.

Thus we get 

Theorem 3. There exists a static dictionary for a subset S of size n of a finite
universe U = {1, . . . , m} that uses n lg m + O(lg lg m) − Θ(cn) bits of space and
answers membership queries in constant time and select queries in O(2^c) time
for any parameter c < lg n.



Corollary 3. There exists a static dictionary for a subset S of size n of a finite
universe U = {1, . . . , m} that uses n lg m + O(lg lg m) − Θ(n) bits of space and
answers membership and select queries in O(1) time. When n = Ω(lg lg m), the
space used is simply n lg m − Θ(n) bits.

3.2 Static Dictionary Supporting Rank and Select 

When n is a constant, we can support membership, rank and select using nlgm 
bits by storing the elements in a sorted array. Also when n > , 

we can store a bit vector of the subset (m bits) and some auxiliary structures 
{o{m) bits) to support membership, rank and select in constant time |5 13] . 

Fiat et al. [7] have given a structure to store multi key records where search
can be performed under any key in constant time. By storing an element and its
rank as a two key record, using this structure, one can support membership, rank
and select queries in constant time. This structure takes
n lg mn + O(lg lg m + lg n) bits of space.

Another obvious way to support both rank and select operations is to take
either of the previous two structures (supporting rank or select) given in the last
subsections and augment it with an array (of n lg n bits) to support the other
operation also in constant time. Thus we can support membership, rank and
select in constant time using a structure that takes n lg mn + O(lg lg m) − Θ(n)
bits of space.

One can also find the rank by performing a binary search in a structure that
supports membership and select. Thus we can support membership and select
in constant time and rank in O(lg n) time using a structure that takes
n lg m + O(lg lg m) − Θ(n) bits of space.
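
The binary search mentioned here is easy to state in code. In this sketch the select operation is simply indexing into a sorted Python list, which is an assumption standing in for the select dictionary of Section 3.1; `rank_via_select` is a hypothetical name.

```python
def rank_via_select(select, n, x):
    """Rank of x (1-based) using only select(j) = j-th smallest; None if absent."""
    lo, hi = 1, n
    while lo <= hi:
        mid = (lo + hi) // 2
        y = select(mid)
        if y == x:
            return mid
        if y < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

elems = [4, 8, 15, 16, 23, 42]

def select(j):
    # Stand-in for a constant-time select structure.
    return elems[j - 1]

print(rank_via_select(select, len(elems), 16), rank_via_select(select, len(elems), 5))
```

The loop makes at most about lg n calls to select, which gives the O(lg n) rank time quoted above.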

It would be interesting to know whether we can support all these operations
in constant time using n lg m + O(lg lg m) + o(n) bits.

4 Representing m-ary Cardinal Trees 

In this section, we look at the problem of representing an m-ary cardinal tree. In
this tree, each node has m positions for an edge to a child, some of which can be
empty. Benoit et al. [2] have given an optimal representation of a cardinal tree
that takes n⌈lg m⌉ + 2n + o(n) bits and supports all navigational operations in
constant time, except finding a child labeled i, which takes at most O(lg lg m)
time. This encoding has two parts. The first one uses the succinct encoding of 
ordinal trees [HE! which takes 2n + o(n) bits to store an ordinal tree of n nodes 
that supports all navigational operations (on ordinal trees) in constant time. 
In the second part, the n⌈lg m⌉ bits of storage is used to store, for each node,
d⌈lg m⌉ bits to encode which children are present, where d is the number of
children at that node.

In this structure, when the given subset is very sparse (namely when n <
lg m), they store the elements in sorted order, so that a search for an element
or finding the rank of it takes O(lg n) time. When n > lg m the universe is split
into equal sized buckets and the values that fall into each bucket are stored using
perfect hash functions. By choosing the number of buckets appropriately, one
can make the space occupied by this structure to be at most n lg m bits.

We observe that by using a static dictionary that supports rank and
membership in constant time that requires at most n lg m bits of space, we can
construct an m-ary cardinal tree structure that supports all navigational operations
in constant time.

If the number of children of a node is less than lg lg m, we store them in
a sorted array. Membership and rank queries in this array can be answered in
O(lg lg lg m) time. When the number of children is at least lg lg m, we store
them using the structure of Corollary 1.

Thus we have 

Theorem 4. There exists an n⌈lg m⌉ + 2n + o(n) bit representation of m-ary
cardinal trees on n nodes that supports the operations of finding the parent of a
node or the size of the subtree rooted at any node in constant time and supports
finding the child with label j in O(lg lg lg m) time.

When n is sufficiently large relative to m (the precise condition appears in
Theorem 5), we propose an alternate structure. The idea is
very similar to the one described in mM . We follow the above encoding except 
for vertices whose degree is at most lg lg m. We construct a table in which each
entry represents a set of size at most lg lg m which is stored as an m bit vector.
Along each entry, we also store an auxiliary structure which takes o(m) bits
to support rank operation on the m bit characteristic vector in constant time.
We will have a two level ordering of the table. In the first level, we order the
sets based on their cardinalities. In the second level, we order sets with the
same cardinality lexicographically (in the bit vector representation). Now in the
cardinal representation, when a node has degree at most lg lg m, instead of storing
its children in sorted order, we simply keep the position of the set in the second level
of the table (since we can compute the cardinality of the set, which is the same
as the degree of the node, from the ordinal information, we obtain the position
in the first level of the table).

The space occupied by the table is + o{m)) which is o(n). The 

space used by the index at each small degree node is lg (m choose d), which is at most
d lg m bits, where d is the degree of the node. Now for these nodes, to search
for a child or to find its rank, we first find the subtree size from the ordinal
information of the node and using the index stored with the node in the cardinal
part, find the bitmap of the subset, which can be used to search for the element
or find its rank (using the o(m) auxiliary structure) in constant time.
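
The table lookup for small-degree nodes can be modelled in a few lines of Python. This is a toy sketch for tiny m only: the table is built by brute-force enumeration, rank on a bit vector is computed by summation rather than by an o(m)-bit auxiliary structure, and the names `build_table` and `child_rank` are assumptions.

```python
from itertools import combinations
from math import comb

def build_table(m, max_degree):
    """table[d] = all d-subsets of {0..m-1} as bit vectors, in lexicographic order."""
    table = {}
    for d in range(max_degree + 1):
        vectors = []
        for subset in combinations(range(m), d):
            vectors.append(tuple(1 if i in subset else 0 for i in range(m)))
        table[d] = sorted(vectors)          # second-level (lexicographic) order
    return table

def child_rank(table, degree, index, label):
    """Rank of child `label` among the node's children, or None if absent."""
    vector = table[degree][index]           # first level: degree, second: index
    if not vector[label]:
        return None
    return sum(vector[:label + 1])          # number of present labels <= label

m = 6
table = build_table(m, 2)
node_children = (0, 1, 0, 0, 1, 0)          # children labeled 1 and 4
index = table[2].index(node_children)       # what a degree-2 node would store
print(len(table[2]) == comb(m, 2), child_rank(table, 2, index, 4))
```

A node of degree d stores only `index`, a number below C(m, d), so it needs about lg C(m, d) ≤ d lg m bits, while the shared table is paid for once.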

Thus we have. 

Theorem 5. There exists an n⌈lg m⌉ + 2n + o(n) bit representation of m-ary
cardinal trees on n nodes that supports all the navigational operations in constant 
time when n is J7(m*®*®'"“'"^). 



5 Conclusions and Open Problems 

We have given representations of static dictionaries that support rank (or select)
and membership queries in constant time using n lg m + O(lg lg m) bits of space.
This gives us a structure that supports rank, select and membership queries in
constant time using n lg mn + O(lg lg m) bits of space. We also gave a
representation of an m-ary cardinal tree that supports all navigational operations in
constant time except finding a child with label j, which takes at most O(lg lg lg m)
time, using n⌈lg m⌉ + 2n + o(n) bits of space.

Some open problems that arise/remain are: 



— Find the space optimal structures for supporting membership, rank (for
elements in the set) and/or select queries in constant time.

— Find a representation of m-ary cardinal trees that takes n⌈lg m⌉ + 2n + o(n)
bits of space and supports all navigational operations in constant time for
all values of n.

Abstract. We study the online problem of holding a number of idle
threads on an application server, which we have ready for processing new
requests. The problem stems from the fact that both creating/deleting
and holding threads is costly, but future requests and completion times
are unpredictable. We propose a practical scheme of barely random
discrete algorithms with competitive ratio arbitrarily close to e/(e − 1).



1 Introduction 

A server in a network has to execute a large number of jobs due to requests from 
several clients. Multithreading is an approach to serve many requests simulta- 
neously (either on parallel processors or scheduled by a multitasking operating 
system), such that short requests do not have to wait for completion of other 
time consuming jobs, which would be frustrating or even unacceptable. Each 
arriving job is assigned to some currently idle thread which is responsible for 
completing the job. If no thread is idle then a new thread is created. A busy 
thread that has finished a job becomes idle again. 

The workload of the server, i.e. the number of parallel jobs, can fluctuate
greatly over time, but creating and deleting a thread is a costly operation.
So we should have enough idle threads ready for future requests. On the other 
hand, running many threads on the spot over long periods would be a waste of 
processor cycles which could be better used by other applications residing on 
the same machine. Hence one has to observe a suitable strategy for deleting idle 
threads, without knowledge of future jobs. 

Note that such an online policy is concerned with the number of jobs and idle
threads only. In contrast to online scheduling problems (cf. [Z|), it also makes no 
essential difference whether the completion times of current jobs are known or 
unknown. (The latter assumption is suitable e.g. if the jobs include select queries 
to databases, such that completion time depends on the previously unknown 
size of the answer). An adversary may both send arbitrarily short requests and 
extend completed jobs immediately by new ones, so the online player cannot
exploit knowledge of execution times. 

We assume that our idle threads do not retard progress of our busy threads 
(but take time from other concurrent applications instead), so the workload, i.e. 
number of busy threads, is given to the online player at any time and is beyond 
his influence. It is natural to assume fixed costs C and D for each create and 
delete operation, respectively, and cost 1 for holding a thread one time unit long.
W.l.o.g. let C + D = 1 throughout the paper; that is, the time unit is
chosen such that running an idle thread for 1 time unit is as expensive as a
creation-deletion pair. We will find this choice to be very convenient.

Our online problem can be stated as follows: How should we assign new jobs 
to idle threads and which of the idle threads should be deleted at what time, 
in order to minimize the total costs? For heavily loaded servers, savings on this
front may have a measurable effect [2]. In view of the practical relevance, the
strategy should not only be competitive, but also easy to implement and, most
importantly, computationally simple. (It would be foolish to save thread costs
only to offset the savings with an excessive amount of additional data structures
for thread administration.)

As we shall explain in Section 2, our problem is a generalization of the single 
rent-to-buy problem, where we need a resource for an unknown time interval 
and are allowed to rent it at price 1 per time unit or to buy it at an arbitrary 
moment at price 1. This problem and some of its variants are well-known under 
different names: leasing, spin-block, ski rental problem etc. Therefore we call our 
problem multiple spin-block. 

Trivially, the best deterministic competitive ratio for single rent-to-buy is 2.
In contrast, there is an $e/(e-1)$-competitive randomized algorithm against an
oblivious adversary [5]. Throughout the paper this is called the KMMO
algorithm. Note that any randomized rent-to-buy algorithm is nothing else than a
probability distribution on the points b in time at which we shall buy the
resource. In the KMMO algorithm, the probability to buy before b is a continuous
function, namely $\int_0^b (e-1)^{-1} e^{t}\,dt$. Under the assumption that evenly distributed
random reals x from [0, 1] are available, we may set $b = \ln(1 + (e-1)x)$.
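
The sampling rule can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the function names are ours, and the cost convention (rent until b, then buy at price 1, where a duration of exactly b is still covered by renting) is an assumption made for the example. It averages the cost over an even grid of values x, which approximates the expectation over a uniform x.

```python
import math

def kmmo_buy_time(x):
    """Map a uniform sample x from [0, 1] to the buy time b = ln(1 + (e - 1) x)."""
    return math.log(1.0 + (math.e - 1.0) * x)

def rent_to_buy_cost(duration, b):
    """Rent at price 1 per time unit until time b, then buy at price 1 (assumed convention)."""
    return duration if duration <= b else b + 1.0

def expected_ratio(duration, samples=100_000):
    """Average the cost over an even grid of x values and divide by the offline optimum."""
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) / samples
        total += rent_to_buy_cost(duration, kmmo_buy_time(x))
    return total / samples / min(duration, 1.0)

# Both printed values should be close to e/(e - 1).
print(expected_ratio(0.7), math.e / (math.e - 1.0))
```

For durations both below and above 1 the estimate should stay near e/(e − 1), in line with the competitive ratio quoted above.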

In [B] a sequence of isolated rent-to-buy decisions is studied under the as- 
sumption that requests follow an unknown but fixed probability distribution, 
and the goal is to adapt the online player’s strategy to this distribution; cf. this 
paper for further motivations and for pointers to empirical studies. In contrast, 
we consider concurrent threads which overlap in time, and we do not make
probabilistic assumptions. The TCP acknowledgment delay problem [3] is of similar
flavour as ours. The difference is that each acknowledgment has unit cost
regardless of the number of acknowledged packets, whereas our operations create or
delete only one thread each. The call admission problem is also different, as re- 
quests may be rejected due to limited capacities, and the goal is to maximize 
the throughput (see e.g. a)- For a general introduction to competitive analysis 
of online problems we refer to [T]. 

We believe that the main contribution of the present paper is a scheme of r- 
competitive multiple spin-block algorithms, with r arbitrarily close to e/(e — 1). 
It is based on KMMO, but barely random and computationally simple. The 
proofs in this extended abstract are only sketched. 
2 Decomposition of the Multiple Spin-Block Problem

The only relevant information from the input is a staircase function from the
reals (time) into the nonnegative integers (number of running jobs), called the
workload function f. Note that $f = 0$ outside the finite interval from the arrival
of the first job until completion of the last job. Similarly, the outcome of a
deletion algorithm is a function $g \ge f$ indicating the number of threads at each
time.

We say that a staircase function g has a downwards step at t if g decreases at
t. Similarly, g has an upwards step at t if g increases at t. There may be several
upwards or downwards steps at the same t. The ordinate of an upwards/downwards
step is the function value before/after the step. A down-up pair is a downwards
step together with the next upwards step on the same ordinate. (Think of
matching open and close parentheses in an arithmetic expression.) The width of
a down-up pair is the time distance between the downwards and upwards step.
Note that $\max_t f(t)$ upwards and downwards steps, respectively, are not involved
in down-up pairs. We may consider them as down-up pairs of infinite width.

The cost of g obviously consists of the following summands: C times the
number of upwards steps, D times the number of downwards steps, and the area
bounded by the graph of g and the time axis. (Since the costs of busy threads
must be paid anyhow, we may replace the last term with the area between the
graphs of g and f. In this case we only consider the overhead for create/delete
operations and idle threads.) The optimal offline cost for a load function f is the
minimum cost of some g with $g \ge f$.

The following offline algorithm is called BRIDGES: Consider any workload
function f. We fix g along the time axis. Start with $g = 0$ before the left endpoint
of f. Whenever $g = f$ and f increases then, clearly, we further keep $g = f$. But if
$g = f$ and f decreases at t then g changes to value $\min\{f(t), \max_{0<h<1} f(t+h)\}$.
(Finally g becomes 0.) Figuratively speaking, g builds bridges over all valleys of
f shorter than 1, and $g = f$ elsewhere. Note that the lookahead of BRIDGES is
bounded by $C + D = 1$.

Lemma 1. For every workload function, BRIDGES yields the unique optimal 
solution. 

Proof. Consider an arbitrary algorithm. It necessarily incurs cost C for every
upwards step which is the first on its ordinate. Similarly, it incurs cost D for
every downwards step which is the last on its ordinate. For every down-up pair
of width w, it pays either $h + 1$ if it deletes a thread h time units after the
downwards step and opens a new thread with the corresponding upwards step,
or it pays w if it lets the thread spin. Hence the best is to delete one thread
immediately if $w > 1$ and to simply continue if $w < 1$. This is exactly what
BRIDGES does, so optimality follows. A more formal proof would use induction
on the number of steps. □
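
The decomposition into down-up pairs used in this proof can be sketched in Python. The representation of the workload as a time-sorted list of unit steps and the function names are assumptions made for this illustration; the matching itself follows the parentheses analogy from the definition of down-up pairs, and the cost follows the case distinction in the proof.

```python
def down_up_pairs(steps):
    """Decompose a workload staircase into down-up pairs.

    steps: time-sorted list of (time, delta), delta = +1 for an upwards step
    and -1 for a downwards step; the workload starts and ends at 0.
    Returns (widths of finite pairs, number of pairs of infinite width).
    """
    open_downs = []      # times of downwards steps still waiting for their upwards step
    widths = []
    infinite_pairs = 0   # upwards steps that are the first on their ordinate
    for time, delta in steps:
        if delta == -1:
            open_downs.append(time)
        elif open_downs:
            widths.append(time - open_downs.pop())   # like matching parentheses
        else:
            infinite_pairs += 1
    return widths, infinite_pairs

def optimal_overhead(steps):
    """Offline optimum: C + D = 1 per infinite pair, min(w, 1) per finite pair."""
    widths, infinite_pairs = down_up_pairs(steps)
    return infinite_pairs + sum(min(w, 1.0) for w in widths)

steps = [(0, +1), (0, +1), (1, -1), (1.5, +1), (2, -1), (4, +1), (5, -1), (5, -1)]
print(down_up_pairs(steps))      # widths 0.5 and 2, two pairs of infinite width
print(optimal_overhead(steps))   # 2 + 0.5 + 1 = 3.5
```

The returned value counts only the overhead for create/delete operations and idle threads, not the cost of busy threads.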

Next we present two straightforward but different generalizations of the 2- 
competitive deterministic solution to the rent-to-buy problem. The first one is: 
EXPIRY DATE STACK
Maintain a stack of threads. 

(1) When a thread becomes idle at time t, assign expiry date t + 1 to it, and add 
it to the stack. 

(2) When a new job arrives, assign it to the last idle thread on the stack and 
remove the now busy thread from the stack. 

(3) When an expiry date is reached, remove the thread from the bottom (!) of 
the stack and delete it. 

Note that the expiry dates are monotone on the stack. Although (3) is not a 
stack operation, we use the term “stack” to stress the fact that idle threads are 
added and assigned to jobs on a last-in-first-out basis. If threads are deleted by 
a central supervisor, it doesn’t matter which thread is deleted or receives a new 
job. Then only the expiry dates must form a stack. 

The function s produced by this algorithm mimics the monotone decreasing
parts of f with delay 1, thereby keeping $s \ge f$.

Theorem 1. EXPIRY DATE STACK is 2-competitive.

Proof. EXPIRY DATE STACK incurs cost 2 for each pair of an upwards and
downwards step which is the first and last, respectively, on its ordinate. This
is clear since it waits for 1 time unit after every such downwards step before
deleting one thread. The optimum would be 1. Additionally, for every down-up
pair of width w, EXPIRY DATE STACK obviously pays w if $w < 1$, and 2 else,
whereas the optimum cost is $\min\{w, 1\}$. Together this implies the result. A more
formal proof might be given by induction. □
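
A small event-driven simulation makes rules (1) to (3) concrete. This is a minimal sketch, not a reference implementation: the event format, the function name and the choice C = D = 0.5 are assumptions for the example (the text only requires C + D = 1), and only the expiry dates are stored, as in the central-supervisor remark above.

```python
from collections import deque

def expiry_date_stack_overhead(steps, create_cost=0.5, delete_cost=0.5):
    """Overhead (create + delete + idle holding) of EXPIRY DATE STACK.

    steps: time-sorted list of (time, delta); delta = +1 when a job arrives,
    -1 when a job completes. create_cost and delete_cost are assumed values.
    """
    idle = deque()    # expiry dates of idle threads; bottom of the stack on the left
    cost = 0.0
    for time, delta in list(steps) + [(float("inf"), 0)]:
        # Rule (3): delete bottom threads whose expiry date has been reached.
        while idle and idle[0] <= time:
            idle.popleft()
            cost += 1.0 + delete_cost    # one full time unit of idling, then deletion
        if delta == -1:
            idle.append(time + 1.0)      # rule (1): the thread became idle now
        elif delta == +1:
            if idle:
                expiry = idle.pop()      # rule (2): reuse the last idle thread
                cost += time - (expiry - 1.0)
            else:
                cost += create_cost      # no idle thread: create a new one
    return cost

steps = [(0, +1), (0, +1), (1, -1), (1.5, +1), (2, -1), (4, +1), (5, -1), (5, -1)]
print(expiry_date_stack_overhead(steps))
```

For this workload the offline optimum in the sense of Lemma 1 is 3.5, so the printed overhead can be compared directly against the factor 2 of Theorem 1.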

EXPIRY DATE STACK is easy enough to implement. However it might be 
more convenient to maintain only constantly many numbers in order to fix the 
next expiry date, instead of a stack. This suggests a simpler algorithm: 

CUMULATIVE IDLE COSTS 

Start with H = 0. Whenever the total cost H of holding the idle threads reaches 
1, delete one thread and reset H to zero. 

Clearly, the next expiry date can be computed in advance from the current 
H and the number of idle threads, and can easily be recomputed whenever the 
number of idle threads changes. So one has always to store only two numbers. 
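
The bookkeeping described here can be sketched as follows. The class and method names are hypothetical, and besides H and the number of idle threads the sketch also keeps the time of the last update, which is an implementation detail assumed for this example.

```python
class CumulativeIdleCosts:
    """Track the accumulated idle cost H and predict the next expiry date."""

    def __init__(self):
        self.H = 0.0          # idle cost accumulated since the last deletion
        self.idle = 0         # current number of idle threads
        self.last = 0.0       # time of the last update (assumed helper field)

    def _advance(self, now):
        # Idle cost grows at rate `idle` per time unit.
        self.H += self.idle * (now - self.last)
        self.last = now

    def set_idle(self, now, idle):
        """Call whenever the number of idle threads changes."""
        self._advance(now)
        self.idle = idle

    def next_expiry(self):
        """Time at which H reaches 1 if the number of idle threads stays the same."""
        if self.idle == 0:
            return None
        return self.last + (1.0 - self.H) / self.idle

    def expire(self, now):
        """Delete one idle thread and reset H to zero."""
        self._advance(now)
        self.H = 0.0
        self.idle -= 1

tracker = CumulativeIdleCosts()
tracker.set_idle(0.0, 2)
print(tracker.next_expiry())   # two idle threads accumulate cost 1 after half a time unit
```

The expiry date is recomputed from the stored values each time it is requested, so a change in the number of idle threads only requires one call to set_idle.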

Theorem 2. CUMULATIVE IDLE COSTS is also 2-competitive.

Proof. Let f be the given load function, g the function that BRIDGES would
produce, and h the function produced by CUMULATIVE IDLE COSTS. We
charge every downwards step of h with costs at most 2, namely D for the
downwards step itself, 1 for the idle threads until deletion of the next thread, and
C for the next upwards step on the same ordinate (if existing). Note that every
downwards step of h which is not below the graph of g can be considered as a
delayed downwards step of g at the same ordinate. Since BRIDGES pays C and
D for every upwards and downwards step of g, CUMULATIVE IDLE COSTS
pays at most twice the optimum as long as $h \ge g$. For time intervals with $h < g$,
the argument for ratio 2 is a bit different: The total cost incurred by BRIDGES
on such intervals is the area between the graphs of g and f. The payments of 2
made by CUMULATIVE IDLE COSTS correspond to disjoint subareas of the
mentioned area, each of size 1. □

Remarks: 

(1) Since 2 is the optimal competitive ratio already for the special case of the 
single rent-to-buy problem, these algorithms are strongly competitive. 

(2) Competitive ratios are understood with respect to the costs of create/delete 
operations and idle threads only. As an immediate corollary, the algorithms are 
also 2-competitive with respect to the costs of create/delete operations, idle and 
busy threads. 

A competitive ratio below 2 can be obtained by randomization. Consider 
an arbitrary but fixed randomized algorithm R for rent-to-buy, with expected 
competitive ratio r against an oblivious adversary. For example, take the KMMO 
algorithm with $r = e/(e-1) \approx 1.58$.

An obvious randomized version of CUMULATIVE IDLE COSTS might first
come to mind:

CUMULATIVE IDLE COSTS (R) 

Start with H = 0. Proceed with the total cost H of holding the idle threads as 
R would do with the rent cost. When R buys then delete one thread and reset 
H to zero. 

Unfortunately this attempt fails, as the following heuristic consideration
shows. Let h be the (random) function produced by CUMULATIVE IDLE
COSTS (R). Consider a “canyon” in f, consisting of n downwards steps at the
same time, followed by n upwards steps t time units later, where $t < 1$.
BRIDGES pays tn there. Since R has expected competitive ratio r, the expected time
until the $(k+1)$-th deletion inside the canyon is $\sum_{i=n-k}^{n} (r-1)/i$.
For large enough n we may assume that the number k of deletions satisfies
$\sum_{i=n-k}^{n} (r-1)/i \approx t$. Since the harmonic numbers grow as the ln function, this
gives $k \approx n(e^{y}-1)/e^{y}$, where $y = t/(r-1)$. Since CUMULATIVE IDLE COSTS
(R) pays kr ($C + D = 1$ for every deletion and creation, and $k(r-1)$ for the area
below h), we would obtain a competitive ratio $kr/tn \approx r/(r-1)$ for f consisting
of a sequence of such canyons with small t. Ironically, this is larger than 2 just
because of $r < 2$. This example suggests paying attention to the moments when
the threads became idle. So we argue that randomization of EXPIRY DATE
STACK is the proper way.
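
The heuristic estimate can be checked numerically. The Python sketch below is an illustration only; it takes the heuristic at face value (each deletion is assumed to take expected time (r − 1)/i while i threads are idle) and counts the deletions that fit into a canyon of width t, which differs from the indexing in the text by at most one term. The function name and parameters are assumptions of this example.

```python
import math

def canyon_ratio(n, t, r):
    """Heuristic ratio k*r/(t*n) for a canyon of n simultaneous downward steps of width t."""
    elapsed = 0.0
    k = 0            # number of deletions completed inside the canyon
    idle = n
    # With `idle` idle threads, the next deletion is assumed to take time (r - 1)/idle.
    while idle > 0 and elapsed + (r - 1.0) / idle <= t:
        elapsed += (r - 1.0) / idle
        idle -= 1
        k += 1
    return k * r / (t * n)

r = math.e / (math.e - 1.0)
print(canyon_ratio(10_000, 0.01, r), r / (r - 1.0))
```

For small t the first printed value should approach the second one, r/(r − 1), which exceeds 2 whenever r < 2.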

EXPIRY DATE STACK (R)

Maintain a stack of threads. 

(1) When a thread becomes idle, add it to the stack. 

(2) When a new job arrives, assign it to the last idle thread in the stack and 
remove the now busy thread from the stack. 
(3) Simultaneously delete idle threads (at any position!) according to R, and
always retain the ordering of surviving threads on the stack. 

Theorem 3. EXPIRY DATE STACK (R) is r-competitive. More precisely, the
expected cost is at most r times the optimum, in each down-up pair of the wor- 
kload function. 

Proof. Consider a “stubborn” version of EXPIRY DATE STACK (R) where the 
list elements do not move after deletion of middle elements. Instead we leave 
the gaps on the stack, standing for idle threads which have already expired, and 
when such a thread is requested from the end of the list then we create a new 
one. For this algorithm, r-competitiveness follows quite easily from linearity of 
expectation, if we decompose the workload function into down-up pairs. It is 
fairly obvious that the original algorithm is not more costly: It picks up the next 
idle thread from the stack as long as there is one. Hence creations are postponed 
or even avoided, and idle costs are saved. From this the assertion follows. □ 

EXPIRY DATE STACK (R) is highly random if we apply R independently to
all threads that become idle. On the other hand, the proof of r-competitiveness
does not rely on independent applications of R at all. (Linearity of expectation
holds for arbitrary random variables.) So we might even fix a single b at
random, and then use it everywhere! The obvious drawback is that certain workload
functions can foil such a solution, i.e. produce an actual competitive ratio
significantly larger than r. (For example, consider a 0,1-valued f where the $f = 0$
intervals have length $b + \epsilon$.) On the other hand, independent applications of R
make the competitive ratio sharply concentrated around r for any large enough
input. So we have a trade-off between randomness and variation. Nicely, we can
remove most randomness from EXPIRY DATE STACK (R) while keeping the
competitive ratio sharply concentrated at r. For this however R must be a discrete
rent-to-buy algorithm as we introduce next.

3 Discrete Randomized Rent-to-Buy 

We return to the single rent-to-buy problem. 

Definition 1. A discrete rent-to-buy algorithm R with denominator n is a
probability distribution on a set of n points $0 < t_1 < \dots < t_n$ such that R buys,
for all k, after $t_k$ time units with probability 1/n. Let r denote the expected
competitive ratio of R.

The question is to find points $t_k$ so as to minimize r, for fixed denominator
n. This is also interesting in its own right, since a discrete strategy does not need real
computations, unlike the continuous KMMO algorithm. The $t_k$ can be computed
once and stored in a table. Note that our problem is “orthogonal” to that of 
randomized snoopy caching solved in where tk = k/n are fixed and the 
probabilities are the variables. 
Let t denote the duration (unknown to the online player) the resource is
needed for. For convenience let $t_0 = 0$ and $t_{n+1} = \infty$. We express r by
$r = \max_k r_k$ where $r_k$ is the worst-case competitive ratio if $t_k < t < t_{k+1}$. Note
that $r_0 = 1$ is redundant. Define $s_k = \sum_{i=1}^{k} t_i$. The expected cost R incurs
is $(s_k + k + (n-k)t)/n$. If $t_{k+1} \le 1$ then $r_k$ is the cost divided by t, which is
maximized if $t = t_k$. Hence $r_k = ((s_k + k)/t_k + n - k)/n$ in this case. If $1 \le t_k$ then
$r_k$ is maximized if $t = t_{k+1}$, hence $r_k = (s_k + k + (n-k)t_{k+1})/n$, particularly
$r_n = (s_n + n)/n$. Finally, if $t_k < 1 < t_{k+1}$ then $r_k$ is the maximum of both
expressions.

Lemma 2. In an optimal R with denominator n we have $t_n = 1$, and
$r = 1 + n^{-1} \max_k ((s_k + k)/t_k - k)$.

Proof. Assuming $t_n > 1$, let u be that index with $t_u < 1 < t_{u+1}$. The $t_k$, $k > u$ do
not appear in the $r_k$, $k < u$, and the $r_k$, $k > u$ are monotone increasing in all $t_i$.
Hence for any fixed $t_1, \dots, t_u$ we get minimum r if $t_{u+1} = \dots = t_n = 1$ instead of the
given values. This shows $t_n \le 1$. We conclude that $r_k = ((s_k + k)/t_k + n - k)/n$
for all k. Now assume $t_n < 1$. If we multiply all $t_i$ by the same factor $a > 1$
then all $r_k$ decrease. Hence r is minimized if we choose a as large as possible, but this
means $t_n = 1$. □

Since $r_k$ is monotone decreasing in $t_k$ and monotone increasing in all previous
$t_i$, we get minimum r if $r_1 = \dots = r_n$. So Lemma 2 yields $n - 1$ algebraic
equations in $n - 1$ variables $t_k$, $k < n$. We illustrate the application in case
n = 2:

Proposition 1. The best R with denominator 2, given by $t_1 = (\sqrt{5} - 1)/2 \approx .62$,
has competitive ratio $(5 + \sqrt{5})/4 \approx 1.81$.

Proof. By Lemma 2, $r_1 = 1 + 1/(2t_1)$ and $r_2 = 1 + (t_1 + 1)/2$. Thus $t_1^2 + t_1 - 1 = 0$. □
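
The case distinction for the worst-case ratios can be evaluated mechanically. The following Python sketch is an illustration under assumptions: the function name is ours, the points are assumed to be strictly increasing, and the boundary case in which a point equals 1 is treated with the second expression. It computes the maximum over all k for a given list of points and is tried on the points of Proposition 1.

```python
import math

def competitive_ratio(points):
    """Worst-case expected ratio of the discrete algorithm that buys at each
    of the n given points (strictly increasing) with probability 1/n."""
    n = len(points)
    t = [0.0] + list(points) + [math.inf]   # t[0] = 0 and t[n+1] = infinity
    worst = 1.0                             # before t_1 the algorithm only rents
    s = 0.0
    for k in range(1, n + 1):
        s += t[k]                           # s_k = t_1 + ... + t_k
        candidates = []
        if t[k] < 1.0:
            # Cost divided by t is largest as t approaches t_k.
            candidates.append(((s + k) / t[k] + n - k) / n)
        if t[k + 1] > 1.0:
            # The optimum pays 1; the cost is largest as t approaches t_{k+1}.
            rent_tail = (n - k) * t[k + 1] if k < n else 0.0
            candidates.append((s + k + rent_tail) / n)
        worst = max(worst, max(candidates))
    return worst

golden = [(math.sqrt(5) - 1) / 2, 1.0]
print(competitive_ratio(golden), (5 + math.sqrt(5)) / 4)
```

Moving the first point away from the value in Proposition 1, for instance to 0.5, makes one of the two ratios larger, which is the balancing effect used in the argument above.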



Denominators n > 2 lead to higher-order algebraic equations that may be
solved numerically. For n = 3 we get $t_1 \approx .45$, $t_2 \approx .78$, and $r \approx 1.75$, etc. It
is important to estimate the competitive ratio for fixed n. Due to the following
result, some n between 5 and 10 should be satisfactory in practice:

Theorem 4. $e/(e-1) + 1/2n - 1/n^2 < r < e/(e-1) + 1/2n$.

Proof. Choose R with $t_k = \ln(1 + (e-1)k/n)$. Recall the following properties of
any R with $t_n \le 1$: We have $r = \max_k r_k$ where the $r_k$ are the expressions from
Lemma 2, and the worst-case ratio of expected cost and t occurs at some $t = t_k$.
Now we compare the expected costs of R and KMMO at any fixed $t = t_k$. By
our choice of $t_k$, the probability to buy until $t_k$ is k/n in both algorithms. It
remains to compare the expected rent time. The probability to buy the resource
in interval $(t_i, t_{i+1}]$ is 1/n, but R defers the buy decisions of KMMO until $t_{i+1}$.
Hence the contribution of every such interval (for $0 \le i < k$) to the expected
rent time of R exceeds that of KMMO by at most $(t_{i+1} - t_i)/2n$. (Since the
density function $e^{t}/(e-1)$ used in KMMO is monotone, the average delay is at
most the half interval length.) Hence the total excess is at most $t_k/2n$. Since in
$r_k$ the costs are divided by $t_k$, the assertion follows.

For the lower bound, consider any discrete rent-to-buy algorithm R'. By
Lemma 2 we may w.l.o.g. assume $t'_n = 1$. Let k be the first index with $t'_k \le t_k$.
Clearly, $r' \ge r'_k \ge r_k$. Hence it suffices to prove the asserted lower bound for each
$r_k$ of our particular R. Similarly as above, it is not hard to observe geometrically
that the contribution of interval $(t_i, t_{i+1}]$ to the excess of expected rent time is
at least $a(t_{i+1} - t_i)/2n$, where $a \ge 2n/(2n + e - 1)$. (Figuratively speaking, since
the density function is monotone, at least a $2e^{t_i}/(e^{t_{i+1}} + e^{t_i})$ fraction of the
probability mass has an average delay of exactly the half interval length. Since
$e^{t_i} = 1 + (e-1)i/n$, the worst case is with i = 0, which yields our factor a.)
From this we obviously get the square term. □
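
The upper bound can be observed numerically for the points chosen at the start of the proof. The sketch below is only an illustration; the function names are ours, and the ratio is computed with the formula of Lemma 2, which presupposes that the last point is 1.

```python
import math

def kmmo_points(n):
    """Points t_k = ln(1 + (e - 1) k / n) for k = 1, ..., n."""
    return [math.log(1.0 + (math.e - 1.0) * k / n) for k in range(1, n + 1)]

def ratio_lemma2(points):
    """r = 1 + max_k((s_k + k)/t_k - k) / n, valid when the last point is 1."""
    n = len(points)
    s = 0.0
    worst = 0.0
    for k, tk in enumerate(points, start=1):
        s += tk
        worst = max(worst, (s + k) / tk - k)
    return 1.0 + worst / n

for n in (1, 2, 5, 10):
    r = ratio_lemma2(kmmo_points(n))
    upper = math.e / (math.e - 1.0) + 1.0 / (2 * n)
    print(n, round(r, 4), round(upper, 4))
```

In each printed row the computed ratio should lie between e/(e − 1) and the upper bound of Theorem 4, with the gap to the upper bound shrinking quickly as n grows.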

A tighter analysis may improve the lower bound. 

4 Barely Random Multiple Spin-Block Algorithms 

Let R be a fixed discrete rent-to-buy algorithm with denominator n and $t_n \le 1$.
Our scheme works as follows. Choose a fixed permutation π of the first n positive
integers, and a real parameter $h \le t_1$, maybe $h = t_1$. Divide the time axis by
lattice points into slices of length h. Let T(l) be the set of m threads that became
idle between l − h and l, and are still idle at l (that means, are not removed
again from the stack by new jobs). Note that R would not delete any threads from
T(l) before l, since they have been idle for less than $t_1$ time units. So we may
fix the expiry dates of all idle threads in T(l) still at time l. This is done in the
following way. Put the threads of T(l) in a round robin fashion into n subsets,
in the ordering they became idle. In other words, any n consecutive threads of
T(l) belong to different subsets. Choose a random integer $x \in \{1, \dots, n\}$. Assign
idle time $v = t_{((\pi(k)+x) \bmod n) + 1}$ to all threads in the k-th subset. (That means,
the expiry date of a thread is v + u if it became idle at u.)
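
The assignment step can be sketched in a few lines of Python. The function name and the list-based encoding of π are assumptions of this example, and the index is read as ((π(k) + x) mod n) + 1; the example values for the points and for π are arbitrary.

```python
import random

def assign_idle_times(num_threads, points, pi, x):
    """Assign idle times to the threads of one slice.

    points: [t_1, ..., t_n]; pi: permutation of 1..n given as a list with
    pi[k-1] = pi(k); x: random shift in 1..n. Threads are numbered in the
    order in which they became idle.
    """
    n = len(points)
    idle_times = []
    for i in range(num_threads):
        k = i % n + 1                       # round-robin subset of the i-th thread
        index = (pi[k - 1] + x) % n + 1     # shifted index in 1..n
        idle_times.append(points[index - 1])
    return idle_times

points = [0.3, 0.5, 0.7, 0.9, 1.0]   # hypothetical t_1..t_5
pi = [3, 1, 4, 2, 5]                 # hypothetical permutation
x = random.randint(1, len(points))   # the only random choice for this slice
print(assign_idle_times(7, points, pi, x))
```

Whatever shift x is drawn, any n consecutive threads receive n different idle times, which is the property exploited later in the proof of Proposition 2.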

Thus every idle thread in T(l) is deleted according to R, just as in EXPIRY
DATE STACK (R). Due to Theorem 3 and linearity of expectation, this version
of EXPIRY DATE STACK (R) is also r-competitive. Moreover it uses a fixed
number of independent random integers per time unit, regardless of the workload
function f. Most pleasantly, r is not only the expected competitive ratio. We
can give a stronger guarantee if the workload is high:

Proposition 2. In every time unit U, the total cost of all but constantly many
down-up pairs of f starting in U is surely (!) at most r times the optimum.

Proof. For simplicity consider the stubborn version of the algorithm, as in
Theorem 3. So every thread is assigned to a fixed down-up pair of f, as soon as it
becomes idle. Partition every T(l) into $T(l)_k$ ($0 \le k \le n$), the sets of threads
in T(l) being responsible for down-up pairs of f of width between $t_k$ and $t_{k+1}$.
These are contiguous subsets of threads on the stack. Further partition each
$T(l)_k$ into contiguous blocks of exactly n threads. There remain fewer than n
threads in each $T(l)_k$, at most $n^2$ threads in each T(l), and $O(n^3)$ threads in
each time unit, since $t_1 \approx (e-1)/n$ in an optimum R. (The latter is not hard to
see from previous section.) Due to the “modulo construction”, each block
contains exactly one thread with idle time $t_j$ ($1 \le j \le n$). From this, the formula
in Lemma 2 implies that the cost of each block in any $T(l)_k$ is within $r_k$ times
the optimum. Finally note $r_k \le r$. □

Hence, though the amount of randomness per time is constant, the
competitive ratio concentrates around r the better, the more down-up pairs f has.
Roughly speaking, this follows by standard arguments (Chernoff bounds) in the
“horizontal direction” (time), and from Proposition 2 in the “vertical direction”
(workload).

It is worthwhile to mention some implementation details: 

(1) Note that Proposition 2 does not rely on the fixed cyclic ordering π of the
$t_k$ values. Thus we are free to choose π such that the $t_k$ are “well mixed” in the
following sense: For each k and m, all segments of length m in the cyclic ordering
should contain almost the same number of $t_i$ values with i < k. It is intuitively
clear that this little effort improves the variance of costs on the $O(n^3)$ remaining
threads mentioned in Proposition 2. The following is a good choice for any n:
Let $\phi = (3 - \sqrt{5})/2$ (thus $1 - \phi$ is the golden ratio). Define a total ordering $\prec$ by
$t_i \prec t_j$ iff $i\phi - \lfloor i\phi \rfloor < j\phi - \lfloor j\phi \rfloor$, and use the cyclic ordering obtained from $\prec$.

(2) In the proof it was convenient to assume that idle times are assigned to
all threads in T(l) at time l, after a random cyclic shift in π, but we may also
assign idle times according to π immediately whenever a thread becomes idle,
and apply a random shift to π after every h time units. This simplifies the code.

(3) A crucial point in a real implementation is to handle the deletion of
threads. Using a timer for each thread is out of the question. It is too expensive
and may even cause a server crash [2]. Instead we should use n timers, one for
each set $T_k$ of idle threads whose idle time is fixed to be $t_k$. The k-th timer
notifies the program if the earliest expiry date in $T_k$ is reached. Then we delete
one thread and set the alarm for the next expiry date in $T_k$ which is found in
n (i.e. constantly many) expected steps: Since the $t_k$ are drawn from a cyclic
ordering, interrupted by random shifts, we may simply search the stack
bottom-up, until we meet a thread from $T_k$. To appreciate this property, note that in
the basic version of EXPIRY DATE STACK (R) (cf. Theorem 3) we have no
such guarantee, so we would need an additional data structure to quickly find
the next expiry date in the stack.
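
The golden-ratio ordering from implementation detail (1) can be sketched as follows. This is a minimal illustration, assuming that the ordering compares the fractional parts of iφ; the function name is ours, and how the resulting sequence is wired into π is left open here.

```python
import math

def golden_cyclic_order(n):
    """Indices 1..n sorted by the fractional part of i * phi, phi = (3 - sqrt(5)) / 2."""
    phi = (3.0 - math.sqrt(5.0)) / 2.0
    return sorted(range(1, n + 1), key=lambda i: (i * phi) % 1.0)

print(golden_cyclic_order(5))
```

Because the fractional parts of multiples of φ spread evenly over the unit interval, small indices end up scattered over the whole cyclic sequence rather than clustered, which is the “well mixed” property asked for above.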
Abstract. Many applications in parallel processing have to traverse
large, implicitly defined trees with irregular shape. The receiver initiated
load balancing algorithm random polling has long been known to be very
efficient for these problems in practice. For any ε > 0, we prove that its
parallel execution time is at most
$(1+\epsilon)T_{seq}/P + O(T_{atomic} + h(1/\epsilon + T_{rout} + T_{split}))$
with high probability, where $T_{rout}$, $T_{split}$ and $T_{atomic}$ bound the
time for sending a message, splitting a subproblem and finishing a small
unsplittable subproblem respectively. The maximum splitting depth h is
related to the depth of the computation tree. Previous work did not prove
efficiency close to one and used less accurate models. In particular, our
machine model allows asynchronous communication with nonconstant
message delays and does not assume that communication takes place in
rounds. This model is compatible with the LogP model.



1 Introduction 



Many algorithms in operations research and artificial intelligence are based on 
the backtracking principle for traversing large irregularly shaped trees that are 
only defined implicitly by the computation 131416191121131141191171211^ . Similar 
problems also play a role in parallel programming languages [nm. Even for loop 
scheduling and some numerical problems EH] like adaptive numerical integra- 
tion |2S] it can be useful to view the computations as an implicitly defined tree 
(refer to |S1] for a more detailed discussion of examples). 

For parallelizing tree shaped computations, a load balancing scheme is needed 
that is able to evenly distribute the parts of an irregularly shaped tree over 
the processors. It should work with minimal interprocessor communication and 
without knowledge of the shape of the tree. Load balancers often suffer from the 
dilemma that subtrees which are not subdivided turn out to be too large for 
proper load balancing whereas excessive communication is necessary if the tree 
is shredded into too many pieces. 

We consider random polling dynamic load balancing [19] (also known as ran- 
domized work stealing j5HUI2llT] '). a simple algorithm that avoids both problems: 
Every processing element (PE) handles at most one piece of work (which may 
represent a part of a backtracking tree) at any point in time. If a PE runs out of 
work, it sends requests to randomly chosen PEs until a busy one is found which
splits its piece of work and transmits one to the requestor. 

We continue this introduction by explaining the machine model in Section 1.1
and the problem model tree shaped computations in Section 1.2. Section 1.3
reviews related work and summarizes the new contributions. The main body of
the paper begins with a more detailed description of random polling in Section 2.
In Section 3 we then give expected time bounds and show in Section 4 that they
also hold with high probability (using additional measures). Finally, Section 5
summarizes the paper and discusses some possible future research.



1.1 Machine Model 

We basically adopt the LogP model [S] due to its simplicity and genericity. There
are P PEs numbered 0 through P−1. We assume a word length of $\Omega(\log P)$ bits.¹
Arithmetic on numbers of word length (including random number generation)
is assumed to require constant time. All messages delivered to a PE are first
put into a single FIFO message queue. In the full LogP model, three parameters
for “latency” L, “overhead” o and “gap” g contribute to the cost of message
transfer. We make the more conservative assumption that sending and receiving
messages always costs $T_{rout} := L + o + g$ units of time. So the analysis also
applies to the widespread messaging protocols that block until a message has
been copied into the message queue of the recipient.



1.2 Tree Shaped Computations 

We now abstract from the applications mentioned in the introduction by
introducing tree shaped computations which expose just enough of their common
properties in order to parallelize them efficiently. All the work to be done is
initially subsumed in a single root problem $I_{root}$. $I_{root}$ is initially located on PE 0.
All other PEs start idle, i.e., they only have an empty problem $I_\emptyset$.

What makes parallelization attractive is the property that problem instances
can be subdivided into subproblems that can be solved independently by different
PEs. For example, a subproblem could be “search this subtree by backtracking”
or “integrate function f over that subinterval”. We model this property by a
splitting operation split(I) that splits a given (sub)problem I into two new
subproblems subsuming the parent problem. Let $T_{split}$ denote a bound on the time
required for the split operation. For example, in backtracking applications a
subproblem is usually represented by a stack and splitting can be implemented by
copying the stack and manipulating the copies in such a way that they represent
disjoint search spaces covering the original search space [26].

The operation work(I, t) transforms a given subproblem I by performing
sequential work on it for t time units. The operation also returns when the
subproblem is exhausted.

¹ Throughout this paper $\log x$ stands for $\log_2 x$.
What makes parallelization difficult is that the size, i.e., the execution time
$T(I) := \min\{t : \mathrm{work}(I, t) = I_\emptyset\}$, of a subproblem cannot be predicted. In
addition, the splitting operation will rarely produce subproblems of equal size.
For the analysis we assume however that
$\forall I : \mathrm{split}(I) = (I_1, I_2) \Rightarrow T(I) = T(I_1) + T(I_2)$
regardless of when and where $I_1$ and $I_2$ are worked on. For a discussion
when this assumption is strictly warranted and when it is a good approximation,
refer to Section 0 and to |32I34| .

Next we quantify some guaranteed “progress” made by splitting subproblems.
Every subproblem I belongs to a generation gen(I), recursively defined by
$\mathrm{gen}(I_{root}) := 0$ and
$\mathrm{split}(I) = (I_1, I_2) \Rightarrow \mathrm{gen}(I_1) = \mathrm{gen}(I_2) = \mathrm{gen}(I) + 1$.
For many applications, it is easy to give a bound on a maximum splitting depth
h which guarantees that the size of subproblems with $\mathrm{gen}(I) \ge h$ cannot
exceed some atomic grain size $T_{atomic}$. For example, a backtracking search
tree of depth d and maximum branching factor b is easy to split in such a way
that $h \le d \lceil \log b \rceil$. We want to exclude problem instances with very
little parallelism and therefore assume $h \ge \log P$. Otherwise, we might
quickly end up with fewer than P atomic pieces of work that cannot be split any
more. Since h is the only factor that constrains the shape of the emerging
“subproblem splitting tree”, it can be viewed as a measure of the irregularity
of the problem instance. (Obviously, very regular instances with large h are
possible. But in applications where this is frequently the case, one should
perhaps look for a splitting function exploiting these regularities to
decrease h.)
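
The model can be made concrete with a toy instance. The following Python sketch is only an illustration under assumptions that are not taken from the paper: a subproblem is reduced to its remaining sequential time, `split` divides that time at a caller-chosen fraction, and the class and field names are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Subproblem:
    size: float    # T(I): remaining sequential time (hidden from a real scheduler)
    gen: int = 0   # gen(I): number of splits on the path from the root problem

    def work(self, t: float) -> Subproblem:
        """Perform up to t time units of sequential work; size 0 means 'empty'."""
        return Subproblem(max(0.0, self.size - t), self.gen)

    def split(self, fraction: float) -> tuple[Subproblem, Subproblem]:
        """Split so that T(I) = T(I1) + T(I2) and both parts are one generation deeper."""
        first = self.size * fraction
        return (Subproblem(first, self.gen + 1),
                Subproblem(self.size - first, self.gen + 1))

root = Subproblem(size=100.0)
left, right = root.split(0.25)
```

In a real application the fraction is not under the caller's control, which is exactly why only the depth bound h is assumed in the analysis.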

Finally, subproblems can be moved to other PEs by sending a single message. 
If problem descriptions are long, the parameters of the LogP model must be 
adapted to reflect the cost of such a long message. The resulting time bounds 
will be conservative since many messages are much shorter. 

The task of the algorithm analysis is now to bound the parallel execution
time $T_{par}$ required to solve a problem instance of size
$T_{seq} := T(I_{root})$ given the problem parameters h, $T_{split}$ and
$T_{atomic}$ and the machine parameters P and $T_{rout}$. The bound is
represented in the form

$T_{par} \le (1 + \epsilon)\frac{T_{seq}}{P} + T_{rest}(P, T_{rout}, T_{split}, h, \ldots)$   (1)

where $\epsilon > 0$ represents some small value we are free to choose. So, for
situations with $T_{rest} \ll T_{seq}/P$ we have a highly efficient parallel
execution.



1.3 Related Work and New Results 

There is a quite large body of related research so that we can only give a rough 
outline. Many algorithms use a simpler approach regarding tree decomposition 
by requiring all “splits” to occur before calls to “work” (in our terminology).
However, this is only efficient for some applications since in the worst case a 
huge number of subproblems may have to be generated or communicated (e.g.,
[17,27]).

Random polling belongs to a family of receiver-initiated load balancing
algorithms, which have the advantage of splitting subproblems only on demand by
idle PEs. This adaptive approach has been used successfully for a variety of
purposes such as parallel functional [1] and logic programming [16] or game
tree search [12]. Randomized partner selection goes back at least to [T^. The
partner selection strategy turns out to be crucial. The apparently economical
option of polling neighbors in the interconnection network can be extremely
inefficient since it leads to a buildup of “clusters” of busy PEs shielding
large subproblems from being split [26]. Polling PEs in a “global round robin”
fashion m avoids this because no large subproblems can “hide”. Execution times
$T_{par} \in O(\frac{T_{seq}}{P} + h T_{count})$ can be achieved, where
$T_{count}$ is the time for incrementing a global counter. However, even
sophisticated distributed counting algorithms have
$T_{count} \in \Omega(T_{rout} \log P / \log\log P)$ [36]. It was long known that
random polling performs better than global round robin in practice, although
the first analytical treatments could only prove an asymptotically weaker bound
$\mathbf{E}T_{par} \in O(\frac{T_{seq}}{P} + h T_{rout} \log P)$ [T^. Tree shaped
computations are a generalization of the $\alpha$-splitting model used in [18].
The gap between analysis and practical experience was closed in [28,29] by
showing that $T_{par} \le (1 + \epsilon)\frac{T_{seq}}{P} + O(h T_{rout})$ with
high probability using synchronous random polling.

Slightly later, random polling (also called randomized work stealing) was
found to be very efficient for scheduling multithreaded computations [5]. For
many underlying applications, the two models can be translated into each other.
The critical path length $T_\infty$ in multithreaded computations then becomes
$hT_{split} + T_{atomic}$ for tree shaped computations. Multithreading can model
predictable dependencies between subproblems, while tree shaped computations
allow for different splitting strategies which may significantly decrease h
[26]. Multithreaded computations are easiest to use with programming language
support, while tree shaped computations are directly useful for a portable and
reusable library [31,34]. In the following, we concentrate on tree shaped
computations. Adapting these results to multithreading, or to some more general
model encompassing both approaches, is an interesting area for future work,
however.

All the analytical results above (including [28,29]) make simplifying
assumptions that are unrealistic for large systems, difficult to implement, or
detrimental to practical performance. The most common assumption is that
communication takes place in synchronized communication rounds. This is
undesirable since idle PEs have to wait for the next communication round and
the network capacity is left unexploited most of the time. In fact, actual
implementations are usually asynchronous. Arora et al. allow small speed
fluctuations (2C to 3C instructions per round), but even that may be difficult
to attain since the number of clock cycles needed per instruction can be highly
data dependent on modern processors (e.g., cache faults for large inputs). They
also assume that polling and splitting take constant time
($T_{rout} + T_{split} \in O(1)$ in our terminology). This is a viable
assumption for moderate size shared memory machines and the thread stack of a
multithreaded language. But we want an algorithm that scales to large
distributed memory machines and allows more sophisticated application specific
splitting functions. Using an even simpler stochastic model, Mitzenmacher was
able to analyze many variants of work stealing [I28J.

Unfortunately, we cannot fully transfer an analysis for the above “round
models” to a realistic asynchronous model since subproblems that are “in
transit” cannot be split and long request queues can build up around PEs that
have “difficult to split” subproblems. In Section 3 we solve these analytical
problems and show that
$\mathbf{E}T_{par} \le (1 + \epsilon)T_{seq}/P + O(T_{atomic} + h(1/\epsilon + T_{rout} + T_{split}))$.
In Section 4 it turns out that this bound also holds with high probability,
although for some values of h it may be necessary to actively trim long queues.

The time bound is not only tight for random polling but in m we also
show a number of lower bounds which come very close: There are tree shaped
computations for which a splitting overhead of $\Omega(hT_{split})$ is
unavoidable, so that we get an
$\Omega(T_{seq}/P + T_{atomic} + hT_{split})$ lower bound. Furthermore, any
receiver-initiated load balancing algorithm not only needs $\Omega(h)$
communications on the critical path but also $\Omega(hP)$ full size messages
overall, so that the network bandwidth is fully utilized. Wu and Kung show that
a similar bound holds for all deterministic algorithms. Random polling can be
slightly improved on certain networks by carefully increasing the average
locality of communication [30]. At least up to constant factors, similar
results can be achieved by dynamic tree embedding algorithms (e.g., [TSl).



2 The Algorithm 

Figure 1 gives pseudo-code for the basic random polling algorithm. PE 0 is
initialized with the root problem as specified in the model. PEs in possession
of nonempty subproblems do sequential work on them but poll the network for
incoming messages at least every $\Delta t$ time units and at most every
$\alpha\Delta t$ time units for any constant $\alpha < 1$.^ When a request is
received, the local subproblem is split and one of the new subproblems is sent
to the requestor. Idle PEs send requests to randomly determined PEs and wait
for a reply until they receive a nonempty subproblem. Requests received in the
meantime are answered with an empty subproblem. Note that an empty subproblem
can be coded by a short message equivalent to a rejection of the request.

Concurrently, a distributed termination detection protocol is run that
recognizes when all PEs have run out of work. We have adapted the four counter
method ^2\ for this purpose. Each PE counts the number of sent and received
messages that contain nonempty subproblems. When the global sums over these
two counts yield identical results over two global addition rounds, there
cannot be any work left (not even in transit). Instead of the ring based
summing scheme proposed in m, we use a tree based asynchronous global reduction
operation. This is a simple and portable way to bound the termination detection
delay by $O(T_{rout} \log P)$.
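
The counter comparison at the heart of this protocol can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the global reduction is replaced by a plain `sum`, and the sketch assumes the counters are collected only from PEs that currently hold no work.

```python
def global_sums(sent, received):
    """One global addition round over the per-PE counters of nonempty-subproblem messages."""
    return sum(sent), sum(received)

def terminated(first_round, second_round):
    """Compare the four counters (S1, R1) and (S2, R2) of two successive rounds.

    Termination is reported only if all four values agree, i.e. every message
    that was sent has been received and nothing changed between the rounds.
    """
    s1, r1 = first_round
    s2, r2 = second_round
    return s1 == r1 == s2 == r2

# Three PEs: one message is still in transit during the first round.
round_1 = global_sums(sent=[2, 1, 0], received=[0, 1, 1])
round_2 = global_sums(sent=[2, 1, 0], received=[0, 1, 2])
round_3 = global_sums(sent=[2, 1, 0], received=[0, 1, 2])
```

Here `terminated(round_1, round_2)` is false because of the message in transit, while `terminated(round_2, round_3)` is true.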

^ If the machine supports it, explicit polling can be replaced by more efficient and 
more elegant interrupt mechanisms which (almost) only cost time when requests 
arrive. 

var I, I' : Subproblem
I := if i_PE = 0 then I_root else I_∅
while no global termination yet do
    if T(I) = 0 then
        send a request to a random PE
        repeat
            receive any message M (blockingly)
            reply requests from PE j with I_∅
        until M is a reply to my request
        unpack I from M
    else I := work(I, Δt)
    if incoming request from PE j then (I, I') := split(I); send I' to PE j



Fig. 1. Basic algorithm for asynchronous random polling. 
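
To get a feeling for the behavior of Fig. 1, the following Python sketch simulates a strongly simplified variant. It is not the asynchronous algorithm analyzed in this paper: it assumes synchronized rounds, integer work units, zero message and splitting cost, and a uniformly random split point. All names are hypothetical.

```python
import random

def simulate_random_polling(num_pes, total_work, seed=0):
    """Return the number of rounds until all work is done (round-based toy model)."""
    rng = random.Random(seed)
    work = [0] * num_pes
    work[0] = total_work          # PE 0 starts with the root problem
    rounds = 0
    while any(work):
        # Idle PEs poll a random PE; a busy victim splits its subproblem.
        for i in range(num_pes):
            if work[i] == 0:
                victim = rng.randrange(num_pes)
                if work[victim] > 1:
                    moved = rng.randint(1, work[victim] - 1)
                    work[victim] -= moved
                    work[i] += moved
        # Every PE holding work performs one unit of sequential work.
        for i in range(num_pes):
            if work[i] > 0:
                work[i] -= 1
        rounds += 1
    return rounds

rounds_needed = simulate_random_polling(num_pes=8, total_work=1000)
```

In this toy model the result always lies between the ideal value (total work divided by the number of PEs, rounded up) and the sequential time.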



3 Expected Time Bounds 

This section is devoted to proving the following bound on the expected parallel
execution time of asynchronous random polling dynamic load balancing:

Theorem 1. $\mathbf{E}T_{par} \le (1 + \epsilon)\frac{T_{seq}}{P} + O(T_{atomic} + h(\frac{1}{\epsilon} + T_{rout} + T_{split}))$
for an appropriate choice of $\Delta t$.

The basic idea for the proof is to partition the execution time of each individual
PE into intervals of productive work on subproblems and intervals devoted to
load balancing. We first tackle the more difficult part and show that a certain
overall effort on load balancing suffices to split all remaining subproblems at least
h times. By definition of h this implies that they are smaller than $T_{atomic}$.
As a preparation, we assign a technical meaning to the terms “ancestor”, “arrive”
and “reach”:

Definition 1. The ancestor of a subproblem I at time t is the uniquely defined
subproblem from which I was derived by applying the operations “work” and
“split”. A load request arrives at the point of time t when it is put into the
message queue of a PE. A load request reaches a subproblem I at time t if it
arrives at some PE at time t and (later) leads to a splitting of I.

We start the analysis by bounding the expense associated with sending and 
answering individual requests: 

Lemma 1.

1. The total amount of active CPU work expended for processing a request is
bounded by $T_{split} + O(T_{rout})$.

2. If any requests have arrived at a PE, at least one of the requests is answered
every $\Delta t + T_{split} + O(T_{rout})$ time units.

3. The expected elapsed time between the arrival of a message and sending the
corresponding reply is in $O(\Delta t + T_{split} + T_{rout})$.

Proof. 1: A request triggers at most one split. The total expense for sending
and receiving is in $O(T_{rout})$. 2: An additional time of $\Delta t$ for
sequential work can elapse until the message queue is checked the next time.
3: Some queues might be long so that some requests are delayed for quite a long
time. However, there are at most P active requests at any point in time. A
request arriving at a random PE will therefore encounter an expected queue
length bounded by $\sum_{i<P}$ “queue length at PE i” $/P \le 1$. ■

When a subproblem is split by one or more subsequent load requests, there is
a dead time interval during which it cannot be reached by any other request.

Lemma 2. All dead times can be covered by associating a dead time
$T_{dead} = \Delta t + T_{split} + O(T_{rout})$ with each request reaching a
subproblem.

Proof. Let I denote a subproblem that is reached by a request R at time t and
at PE i. Let $k \ge 0$ denote the number of requests in the message queue of
PE i that reach I before R. Only if I is moved to another PE j due to R, I
cannot be reached by any request arriving after t until I is put into the
message queue of PE j. In the worst case, the dead time is
$(k + 1)(\Delta t + T_{split} + T_{rout})$. This is the case when “work” has
just been called for the ancestor of I. Then a time $\Delta t$ passes until the
load balancer is next activated. Subsequently, the ancestor is split with an
expense of $T_{split}$ and a subproblem is sent away. This cycle is repeated
k + 1 times. Then I is reachable on PE j. The total dead time can be
distributed over the k + 1 requests involved. ■

Now we know the various costs and delays associated with requests. If we
could find out how many requests are necessary to split all subproblems h times
with high probability, we would be almost done. However, the question is not
yet stated precisely enough. Requests that arrive during a dead time of a
subproblem are “lost” for that subproblem. We therefore only consider a subset
of all completed requests that has the property of being “sufficiently
uniformly” distributed over time.

Definition 2. A request may be colored red if there are at most P other red
requests during a time interval $T_{dead}$ after its arrival.

Lemma 3. Let $I\langle i\rangle$ denote the subproblem at PE i. For every
$\beta > 0$ there is a constant $c > 0$, such that after processing cPh red
requests

$\mathbf{P}[\exists i : \mathrm{gen}(I\langle i\rangle) < h] < P^{-\beta}$ (for sufficiently large P).

Proof. For some fixed PE index i, we have

$\mathbf{P}[\exists i : \mathrm{gen}(I\langle i\rangle) < h] \le P \cdot \mathbf{P}[\mathrm{gen}(I\langle i\rangle) < h]$.

So it suffices to show that
$\mathbf{P}[\mathrm{gen}(I\langle i\rangle) < h] \le P^{-\beta-1}$ for sufficiently
large P. We can bound $\mathrm{gen}(I\langle i\rangle)$ by the number of red
requests that reach $I\langle i\rangle$. Uncolored requests can be ignored here
w.l.o.g.: Although it may happen that an uncolored request reaches
$I\langle i\rangle$ and causes one or more subsequent red requests to miss
$I\langle i\rangle$, this split will be accounted to the next following red
request and its dead time suffices to explain that the subsequent red requests
miss $I\langle i\rangle$. Using a combinatorial treatment, we now show that
$\sum_{k<h} P_k \le P^{-\beta-1}$ where

$P_k := \mathbf{P}[I\langle i\rangle \text{ is reached by } k \text{ red requests}]$.

There are $\binom{chP}{k}$ ways to choose k red requests that are to reach
$I\langle i\rangle$. The probability that they are all heading for PE i is
$P^{-k}$. Since there are at most P red requests in the dead time after a
request, there are at least $chP - kP$ remaining red requests that do not reach
$I\langle i\rangle$. The probability of this event is

$\left(1 - \frac{1}{P}\right)^{chP - kP} \le e^{-(ch-k)}$.

All in all, we have

$P_k \le \binom{chP}{k} P^{-k} e^{-(ch-k)} \le \left(\frac{che}{k}\right)^{k} e^{-(ch-k)}$

using the Stirling approximation $\binom{m}{k} \le (me/k)^k$. Since $k < h$, it
is easy to verify that the k-dependent part of the above expression is
monotonically increasing with k for $c > 1/e$ and can be bounded from above by
setting k = h, i.e.,

$P_k \le (ce)^{h} e^{-(c-1)h} = e^{-h(c - \ln c - 2)}$.

Now $\mathbf{P}[\mathrm{gen}(I\langle i\rangle) < h]$ can be bounded by

$h e^{-h(c - \ln c - 2)} = e^{-h(c - \ln c - 2 - \frac{\ln h}{h})} \le e^{-h(c - \ln c - 2 - \frac{1}{e})}$.

Since we assume that $h \in \Omega(\log P)$ there is a $c'$ such that
$h \ge c' \ln P$:

$\mathbf{P}[\mathrm{gen}(I\langle i\rangle) < h] \le P^{-c'(c - \ln c - 2 - \frac{1}{e})} \le P^{-\beta-1}$

for an appropriate c and sufficiently large P. ■
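
The monotonicity claim used in the proof of Lemma 3 can be checked numerically. The sketch below reads the bound on $P_k$ as $(che/k)^k e^{-(ch-k)}$, evaluates its logarithm for sample values of h and c (chosen here purely for illustration), and compares the value at k = h with the closed form $-h(c - \ln c - 2)$.

```python
import math

def log_pk_bound(k, h, c):
    """Natural logarithm of (c*h*e/k)**k * exp(-(c*h - k))."""
    return k * (math.log(c * h) + 1 - math.log(k)) - (c * h - k)

h, c = 40, 6.0   # illustrative values with c > 1/e
values = [log_pk_bound(k, h, c) for k in range(1, h + 1)]
closed_form_at_h = -h * (c - math.log(c) - 2)
```

The derivative of the logarithm with respect to k is $\ln(ch/k) + 1$, which is positive for all $k \le h$ whenever $c > 1/e$, so the list `values` is increasing and its last entry matches `closed_form_at_h`.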

Now we bound the expense for all requests in order to have cPh red ones
among them.

Lemma 4. Let $c > 0$ denote a constant. Requests can be colored in such a way
that an expected work in $O(hP(\Delta t + T_{split} + T_{rout}))$ for all request
processing suffices to process chP red requests.

Proof. Let $R_1, \ldots, R_m$ denote all the requests processed and let
$t(R_1) \le \cdots \le t(R_m)$ denote the arrival times of the $R_i$. Going
through this sequence of requests, we color P subsequent requests red and then
skip the requests following in an interval of $T_{dead}$, etc. Since there can
never be more than P requests in transit, there can be at most 2P uncolored
requests whose execution overlaps an individual red interval. Therefore, the
expense for P red requests can be bounded by $P\,T_{dead}$ plus the expense for
processing 3P requests. The expense for this is given in Lemma 1. ■
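
The coloring rule from the proof of Lemma 4 can be sketched as a single pass over the arrival times. The function name and the list-based representation are assumptions made for this illustration; the input is assumed to be sorted by arrival time.

```python
def color_requests(arrival_times, num_pes, t_dead):
    """Return the indices of the requests colored red.

    Repeatedly color num_pes subsequent requests red, then skip every request
    that arrives within t_dead after the last request of that red batch.
    """
    red = []
    i, n = 0, len(arrival_times)
    while i < n:
        batch = list(range(i, min(i + num_pes, n)))
        red.extend(batch)
        last_arrival = arrival_times[batch[-1]]
        i = batch[-1] + 1
        while i < n and arrival_times[i] <= last_arrival + t_dead:
            i += 1   # uncolored request inside the dead-time window
    return red

red_indices = color_requests([0, 1, 2, 3, 4, 10, 11], num_pes=2, t_dead=2)
```

For this input the requests arriving at times 0, 1, 4 and 10 are colored red. Each red request is followed within $T_{dead}$ by at most the remaining members of its own batch, which is consistent with Definition 2.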

By combining Lemmas 3 and 4 we get a bound for the communication expense of
random polling until only atomic subproblems are left.

Lemma 5. The expected overall expense for communicating, splitting and
waiting until there are no more subproblems with $\mathrm{gen}(I) < h$ is in
$O(hP(\Delta t + T_{split} + T_{rout}))$.

Bounding the expense for sequential work, i.e., calls of “work”, is easy. Let
$T_{poll}$ denote the (constant) expense for probing the message queue
unsuccessfully. It suffices to choose $\Delta t \ge T_{poll}/(\epsilon\alpha)$
to make sure that only $(1 + \epsilon)T_{seq}$ time units are spent for those
iterations of the main loop where the local subproblem is not exhausted and no
requests arrive. All other loop iterations can be accounted to load balancing.

As the last component of our proof, we have to verify that atomic subpro- 
blems are disposed of quickly and that termination detection is no bottleneck. 

Lemma 6. If $\Delta t \in \Omega(\min(\frac{T_{atomic}}{h}, T_{rout} + T_{split}))$
and $\mathrm{gen}(I\langle i\rangle) \ge h$ for all PEs then the remaining
execution time is in

$O(T_{atomic} + h(\Delta t + T_{split} + T_{rout}))$.

Proof. From the definition of h we can conclude that for all remaining
subproblems I we have $T(I) \le T_{atomic}$. For
$\frac{T_{atomic}}{h} \in O(T_{rout} + T_{split})$, $O(h)$ iterations (of each
PE) with cost $O(\Delta t + T_{split} + T_{rout})$ each suffice to finish up all
subproblems. Otherwise, a busy PE spends at least a constant fraction of its
time with productive work even if it constantly receives requests.^ Therefore,
after a time in $O(T_{atomic})$ no nonempty subproblems will be left. After a
time in $O(T_{rout} \log P) \subseteq O(hT_{rout})$, the termination detection
protocol will notice this condition. ■

The above building blocks can now be used to assemble a proof of Theorem 1.
Choose some
$\Delta t \in O(T_{rout} + T_{split}) \cap \Omega(\min(\frac{T_{atomic}}{h}, T_{rout} + T_{split}))$
such that $\Delta t \ge T_{poll}/(\epsilon\alpha)$ (where $T_{poll}$ is the
constant time required to poll the network in the absence of messages). This is
always possible, and for the frequent case
$T_{atomic}/h \gg T_{rout} + T_{split}$ there is also a very wide feasible
interval for $\Delta t$. Every operation of the algorithm in Fig. 1 is either
devoted to working on a nonempty subproblem or to load balancing in the sense
of Lemma 5. Therefore, after an expected time of
$(1 + \epsilon)\frac{T_{seq}}{P} + O(h(1/\epsilon + T_{rout} + T_{split}))$
sufficiently many requests have been processed such that only subproblems with
$\mathrm{gen}(I) \ge h$ are left with high probability. The polynomially small
fraction of cases where this number of requests is not sufficient cannot
influence the expectation of the execution time since even a sequential
solution of the problem instance takes only $O(P)$ times as long as a parallel
execution. According to Lemma 6, an additional time in
$O(T_{atomic} + h(1/\epsilon + T_{split} + T_{rout}))$ suffices to finish up the
remaining subproblems and to detect termination. ■

^ In the full LogP model even $\Delta t \in \Omega(\min(\frac{T_{atomic}}{h}, \max(T_{split} + o, g)))$ suffices.

4 High Probability

In order to keep the algorithm and its analysis as simple as possible, Theorem 1
only bounds the expected parallel execution time. In [33] it is also shown how
the same bounds can be obtained with high probability. The key observation is
that Martingale tail bounds can be used to bound the sum of all queue lengths
encountered by requests if the maximal queue length is not too large.

Theorem 2. For $\Delta t$ and $\epsilon$ as in Theorem 1,

$T_{par} \le (1 + \epsilon)\frac{T_{seq}}{P} + O(T_{atomic} + h(\frac{1}{\epsilon} + T_{split} + T_{rout}))$

if $h \in \Omega(P \log P)$ or queue lengths in are avoided by algorithmic
means.^

5 Discussion 

Tree shaped computations represent an extreme case for parallel computing in
two respects. On the one hand, parallelism is very easy to expose since
subproblems can be solved completely independently. On the other hand, they are
the worst case with respect to irregularity. Not only can splitting be
arbitrarily uneven (only constrained by the maximum splitting depth h) but it
is not even possible to estimate the size of a subproblem. Considering the
simplicity of random polling and its almost optimal performance (both in theory
and practice), the problem of load balancing tree shaped computations can
largely be considered as solved.

Although tree shaped computations span a remarkably wide area of applications,
an important area for future research is to generalize the analysis to
models that cover dependencies between subproblems. The predictable
dependencies modeled by multithreaded computations [2] are one step in this
direction. But in many classic search problems the main difficulty is that
heuristics prune the search tree in an unpredictable way.

Abstract. Hypergraph 2-colorability, also known as set splitting, is a
widely studied problem in graph theory. In this paper we study the
maximization version of the same. We recast the problem as a special type
of satisfiability problem and give approximation algorithms for it. Our
results are valid for hypergraph 2-colorability, set splitting and MAX-CUT
(which is a special case of hypergraph 2-colorability) because the
reductions are approximation preserving. Here we study the MAXNAESP
problem, the optimal solution to which is a truth assignment of the
literals that maximizes the number of clauses satisfied. As a main result
of the paper, we show that any locally optimal solution (a solution is
locally optimal if its value cannot be increased by complementing
assignments to literals and pairs of literals) is guaranteed a performance
ratio of $\frac{1}{2} + \epsilon$. This is an improvement over the ratio of
$\frac{1}{2}$ attributed to another local improvement heuristic for
MAX-CUT [2]. In fact we provide a bound of $\frac{k}{k+1}$ for this problem,
where $k \ge 3$ is the minimum number of literals in a clause. Such locally
optimal algorithms appear to subsume typical greedy algorithms that have
been suggested for problems in the general domain of satisfiability. It
should be noted that the NAESP problem where each clause has exactly two
literals is equivalent to MAX-CUT. However, obtaining good approximation
ratios using semi-definite programming techniques [3] appears difficult.
Also, the randomized rounding algorithm as well as the simple randomized
algorithm [4] both yield a bound of $\frac{1}{2}$ for the MAXNAESP problem.
In contrast to this, the algorithm proposed in this paper obtains a bound
of $\frac{1}{2} + \epsilon$ for this problem.

Keywords: Approximation Algorithms, Hypergraph 2-colorability, Set 
Splitting, MAXNAESP, MAX-CUT. 



1 Introduction 

Hypergraph 2-colorability^ is defined as follows:

INSTANCE: A collection C of subsets of a finite set S.

QUESTION: Is there a partition of S into two subsets $S_1$ and $S_2$ such that
every set in C has elements from both $S_1$ and $S_2$?

The maximization version of this problem corresponds to finding two sets $S_1$
and $S_2$ such that the maximum number of sets in C have elements from both
$S_1$ and $S_2$.

MAX-CUT is the problem of partitioning the nodes of an undirected graph 
G = (V,E) into two sets S and V — S such that there are as many edges 
as possible between S and V — S. Both hypergraph 2-colorability and MAX- 
CUT are NP-Complete[6]. We recast the maximization version of hypergraph 
2-colorability as a maximization version of a special type of satisfiability problem 
called MAXNAESP. 

NAESP is the not-all-equal satisfiability problem with only positive literals
in the clauses, where a clause is satisfied if it has at least one literal set
to 0 and at least one literal set to 1. NAESP has a solution if every clause in
the problem is satisfied. MAXNAESP is defined to be the problem of assigning
boolean values to literals such that the maximum number of clauses are
satisfied. MAXNAESP is a generalization of MAX-CUT, which is evident from the
following reduction. Given a graph G = (V, E), for every edge
$e = (u, v) \in E$ we have a clause (u, v) in the corresponding MAXNAESP
problem. It is easy to see that if there exists a cut of size k then there
exists a solution to the NAESP problem which satisfies at least k clauses.
Hence, we can regard MAXNAESP as a generalization of MAX-CUT; it is also
equivalent to hypergraph 2-colorability. The equivalence can be observed by
associating each clause with an element (a subset of S) of the collection C in
the instance of hypergraph 2-colorability, and the assignment of boolean values
to literals with the partition of S into $S_1$ and $S_2$.

The technique of local improvement has been widely used as a heuristic for
solving combinatorial optimization problems. A novel approximation algorithm
for MAX-CUT is based on the idea of local improvement [6]. The idea is to start
with some partition S and V − S and, as long as we can improve the quality of
the solution (the number of cross edges) by adding a single node to S or by
deleting a single node from S, we do so. It has been shown that this
approximation algorithm [2] for MAX-CUT has a worst case performance ratio of
$\frac{1}{2}$.

In this paper we describe another local improvement based approximation
algorithm for solving MAXNAESP which yields a performance ratio of
$\frac{k}{k+1}$ where each clause contains at least $k \ge 3$ variables. The
algorithm described gives a bound of $\frac{1}{2} + \epsilon$ for MAX-CUT, in
contrast to the bound of $\frac{1}{2}$ achieved by the local search algorithm
described in [2].

NP-Completeness of MAXNAESP follows from the fact that MAX-CUT is
a special case of MAXNAESP. We give a polynomial reduction from 3NAES, in
which each clause has exactly three literals and the literals are not
restricted to be positive [1], to NAESP, thereby establishing the
NP-Completeness of NAESP. In Section 2, we look at MAXNAESP and provide two
approximation algorithms for it.

We begin with some definitions: 

Definition 1 NAES: Given a set of clauses, find a satisfying truth assignment 
such that each clause contains at least one true literal and one false literal. 









Definition 2 NAESP: Given a set of clauses, where each clause contains only 
positive literals, find a satisfying truth assignment such that each clause contains 
at least one true literal and one false literal. 



Theorem 1 3NAES $\le_p$ 3NAESP.

Proof: Let n be the number of clauses and m be the number of variables in
3NAES. Let us denote the literals in 3NAES by
$x_1, \bar{x}_1, x_2, \bar{x}_2, \ldots, x_m, \bar{x}_m$. We replace each
$\bar{x}_i$ with a new literal $x_{m+i}$ in all the clauses. In addition, for
each pair of literals $x_i, \bar{x}_i$, we add the following four clauses:
$(x_i, x_{m+i}, a)$, $(x_i, x_{m+i}, b)$, $(x_i, x_{m+i}, c)$, $(a, b, c)$. We
have a total of (n + 4m) clauses in the instance of 3NAESP so generated.

($\Rightarrow$) If 3NAES is satisfiable then the set of clauses in the instance
of 3NAESP generated is also satisfiable. The solution is obtained by assigning
$x_{m+i}$ the same value as $\bar{x}_i$ in 3NAES. In addition, the literal a is
set to 0 and the literals b, c are set to 1.

($\Leftarrow$) If the clauses in the instance of 3NAESP generated are
satisfiable then the solution to 3NAES is obtained by setting $\bar{x}_i$ to
the same value that $x_{m+i}$ is set to. This assignment clearly satisfies all
the clauses. For literals $x_i, \bar{x}_i$, the four clauses
$(x_i, x_{m+i}, a)$, $(x_i, x_{m+i}, b)$, $(x_i, x_{m+i}, c)$, and $(a, b, c)$
guarantee that $x_i$ and $\bar{x}_i$ are not both set to 1 (or both to 0). □
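
The reduction of Theorem 1 is small enough to be checked by brute force on tiny instances. The sketch below makes several assumptions for illustration only: literals are encoded as signed integers (`-i` for the negation of variable `i`), the auxiliary literals a, b, c are shared by all gadgets and encoded as variables 2m+1, 2m+2, 2m+3, and both function names are hypothetical.

```python
from itertools import product

def nae_satisfiable(clauses, num_vars):
    """Brute-force test: is there an assignment making no clause all-equal?"""
    for bits in product([False, True], repeat=num_vars):
        def value(lit):
            return bits[abs(lit) - 1] ^ (lit < 0)
        if all(len({value(lit) for lit in clause}) == 2 for clause in clauses):
            return True
    return False

def reduce_3naes_to_3naesp(clauses, m):
    """Replace each negated literal -i by the fresh variable m+i and add the
    four gadget clauses per variable; returns (clauses, number of variables)."""
    a, b, c = 2 * m + 1, 2 * m + 2, 2 * m + 3
    reduced = [tuple(lit if lit > 0 else m - lit for lit in clause)
               for clause in clauses]
    for i in range(1, m + 1):
        reduced += [(i, m + i, a), (i, m + i, b), (i, m + i, c), (a, b, c)]
    return reduced, 2 * m + 3

satisfiable = [(1, -2, 3)]
unsatisfiable = [(1, 2, 3), (1, 2, -3), (1, -2, 3), (-1, 2, 3)]
```

The second instance contains every sign pattern over three variables up to complementation, so each assignment makes one of its clauses all-equal. Reducing either instance with `m = 3` yields only positive literals and n + 4m clauses, and satisfiability is preserved.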



2 Approximation Algorithms 

In this section we study the optimization version of the NAESP problem called
MAXNAESP. The objective in MAXNAESP is to find a solution which maximizes the
number of clauses satisfied. We first present a simple approximation algorithm
for MAXNAESP which has a worst-case performance bound of $\frac{1}{2}$. We
then examine a locally optimal approximation algorithm for the problem whose
worst-case performance bound is $\frac{1}{2} + \epsilon$, and which in general
has a bound of $\frac{k}{k+1}$, where $k \ge 3$ is the minimum number of
literals in a clause.

$\frac{1}{2}$ Approximation Algorithm: Let the problem comprise m literals and
n clauses. A literal u is arbitrarily picked and set to 1 if it occurs in at
least $\frac{n}{2}$ clauses. The remaining literals are set to 0 and the
algorithm terminates. If however the literal u occurs in only
$k < \frac{n}{2}$ clauses, then the literal u is set to 0 and the subproblem
comprising the n − k clauses not containing the literal u is recursively solved
(with the literal u removed from the subproblem). If the solution S to the
subproblem satisfies more than $\frac{k}{2}$ clauses containing u then the
algorithm returns the solution S for the remaining literals. Else the algorithm
returns $\bar{S}$, the complement of the solution S for the remaining literals
(in the complement solution $\bar{S}$, all the literal settings are
complemented).
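
The recursive procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: clauses are frozensets of at least two distinct variables, the picked literal u is simply the first variable in the list, the test uses "at least half" (as in the proof of Theorem 2) where the description says "more than", and the function names are hypothetical.

```python
def satisfied(clause, assignment):
    """A clause is NAE-satisfied if it contains both a 0 and a 1."""
    return len({assignment[v] for v in clause}) == 2

def half_approx(variables, clauses):
    """Return a dict variable -> 0/1 satisfying at least half of the clauses."""
    if not clauses:
        return {v: 0 for v in variables}
    u, rest = variables[0], variables[1:]
    with_u = [c for c in clauses if u in c]
    if 2 * len(with_u) >= len(clauses):
        # u = 1 and everything else 0 satisfies every clause containing u.
        return {u: 1, **{v: 0 for v in rest}}
    sub = half_approx(rest, [c for c in clauses if u not in c])
    candidate = {u: 0, **sub}
    if 2 * sum(satisfied(c, candidate) for c in with_u) >= len(with_u):
        return candidate
    # Complement the subproblem solution; u stays 0.
    return {u: 0, **{v: 1 - x for v, x in sub.items()}}

clauses = [frozenset(c) for c in [(2, 3), (3, 4), (4, 5), (2, 5), (1, 2)]]
assignment = half_approx([1, 2, 3, 4, 5], clauses)
```

Complementing the subproblem solution does not change which subproblem clauses are satisfied, since the not-all-equal condition is symmetric in 0 and 1.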

Theorem 2 The above algorithm has a performance ratio of $\frac{1}{2}$.

Proof: We prove by induction on the number of literals l that the algorithm
satisfies at least $\frac{n}{2}$ clauses.

Base Case: For l = 2 (the minimum number of literals required for NAESP)
each clause contains both literals (else the clauses cannot be satisfied and can
therefore be removed) and so the result follows trivially.

Induction Hypothesis: Assume the algorithm satisfies at least $\frac{1}{2}$ the
total number of clauses for $l < p$ literals.

Induction Step: Let the number of clauses be n. Let the set of clauses satisfied
by literal u be denoted U. The algorithm either sets u to 1 (if u satisfies
$k \ge \frac{n}{2}$ clauses), or sets u to 0 and solves the subproblem with
p − 1 literals (and n − k clauses) recursively. By the induction hypothesis, the
algorithm derives a solution S that satisfies at least $\frac{n-k}{2}$ clauses
in the subproblem. If in addition S also satisfies at least $\frac{k}{2}$
clauses among U, then the algorithm satisfies at least $\frac{n}{2}$ in total.
If on the other hand S satisfies less than $\frac{k}{2}$ clauses among U, then
$\bar{S}$ satisfies at least $\frac{k}{2}$ clauses among U. This follows from
the fact that the clauses that are unsatisfied in U due to assignment S have
all their variables set to 0, implying that the assignment $\bar{S}$ satisfies
all these clauses (because the variable u in all these clauses is set to 0). In
addition, both the assignment S and its complement $\bar{S}$ continue to
satisfy the same number of clauses in the subproblem. Thus the algorithm
satisfies at least $\frac{n}{2}$ clauses in total. □

A trivial example where a literal satisfies exactly $\frac{1}{2}$ the total
number of clauses suffices to show that the bound is tight.

In the next section we describe an approximation algorithm for MAXNAESP
and show that the algorithm has a performance ratio of $\frac{k}{k+1}$ where k
is the minimum clause length.

2.1 Approximation Algorithm 

This approximation algorithm relies on the notion of a local optimum. A solution 
S to MAXNAESP, is locally-1 optimal if the number of clauses satisfied cannot 
be increased by complementing the value to which a literal is set in the solution 
S. A solution S to MAXNAESP, is locally-2 optimal if the number of clauses 
satisfied cannot be increased by complementing the value of the literals in a 
satisfied clause of cardinality 2. A solution is said to be locally optimal if it is 
both locally-1 optimal and locally-2 optimal. 

The algorithm starts with a random assignment of values to literals (say all 
literals are set to 0). It then complements the setting of a literal if this improves 
the solution. It then looks at all the satisfied clauses of cardinality 2 and checks 
if complementing both the literals increases the number of satisfied clauses. The 
algorithm continues in this manner until no further improvement results. 
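
A minimal sketch of this local search is given below. The representation (clauses as tuples of distinct variables), the all-zero starting assignment, the order in which moves are tried, and the function names are assumptions made for the illustration.

```python
from itertools import combinations

def count_satisfied(clauses, assign):
    """Number of clauses containing both a 0 and a 1."""
    return sum(len({assign[v] for v in c}) == 2 for c in clauses)

def local_search(variables, clauses):
    """Apply improving single flips and pair flips until none is left."""
    assign = {v: 0 for v in variables}
    best = count_satisfied(clauses, assign)
    moves = [(v,) for v in variables] + [tuple(c) for c in clauses if len(c) == 2]
    improved = True
    while improved:
        improved = False
        for move in moves:
            if len(move) == 2 and assign[move[0]] == assign[move[1]]:
                continue  # pair flips are only tried on satisfied 2-clauses
            for v in move:
                assign[v] ^= 1
            value = count_satisfied(clauses, assign)
            if value > best:
                best, improved = value, True
            else:
                for v in move:  # undo a non-improving move
                    assign[v] ^= 1
    return assign

variables = list(range(6))
clauses = list(combinations(variables, 3))   # all 20 clauses of size 3
result = local_search(variables, clauses)
```

Every accepted move increases the number of satisfied clauses by at least one, so the loop ends after at most n improving moves. On the instance above every clause has three literals, so by the analysis that follows a locally optimal solution satisfies at least 3/4 of the clauses.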

Let S be the locally-optimal solution. With respect to this solution S, we
divide the set of literals into two disjoint subsets,
$X = \{x_1, x_2, \ldots, x_p\}$ comprising all literals which are set to 1, and
$Y = \{y_1, y_2, \ldots, y_q\}$ comprising all literals which are set to 0. We
divide the clauses of MAXNAESP into three sets, A, B, and C. A is the set of
all satisfied clauses, B (C) is the set of all unsatisfied clauses for which
each literal in a clause belongs to X (Y). We note that each clause in A
contains at least one literal that belongs to X, and one literal that belongs
to Y (else the clause will not be satisfied). Let $B(x_i)$, $1 \le i \le p$
($C(y_j)$, $1 \le j \le q$) denote the number of clauses in B (C) which contain
the literal $x_i$ ($y_j$). Let A(x) (A(y)) denote the number of clauses in A
that contain only one literal from the set X (Y).

Lemma 1 $A(x) \ge \sum_{i=1}^{p} B(x_i)$.

Proof: For the $B(x_i)$ clauses in B containing the literal $x_i$, there should
be $A(x_i) \ge B(x_i)$ clauses in A which contain only literal $x_i$ from the
set X (and all other literals from the set Y). If this is not the case then by
setting $x_i$ to 0 we can increase the number of clauses satisfied, violating
the fact that S is locally-optimal. Noting that a clause in A which contains
only literal $x_i$ is distinct from a clause in A that contains only literal
$x_j$, $i \ne j$, $1 \le i, j \le p$, it follows that the $A(x_i)$ clauses in A
containing only literal $x_i$ are distinct from the $A(x_j)$ clauses in A
containing only literal $x_j$. Thus,
$A(x) = \sum_{i=1}^{p} A(x_i) \ge \sum_{i=1}^{p} B(x_i)$. □

The corollary below may be shown using similar reasoning as above. 

Corollary 1 $A(y) \geq \sum_{j=1}^{q} C(y_j)$. 

Lemma 2 below derives an upper bound on $|B|$. 

Lemma 2 $|B| \leq \frac{1}{k} \sum_{i=1}^{p} B(x_i)$. 

Proof: Each clause $b_l \in B$ has $k_l \geq k$ literals in it. Counting the number of 
1's occurring in the clauses in set B, we can write 
$\sum_{i=1}^{p} B(x_i) = \sum_{l=1}^{|B|} k_l$. The left hand side counts the 
occurrence of 1's in each literal, and the right hand side counts the occurrence 
of 1's in each clause. But $\sum_{l=1}^{|B|} k_l \geq \sum_{l=1}^{|B|} k = |B| \times k$, 
from which the result follows. □ 

By a similar reasoning, the corollary below follows. 

Corollary 2 $|C| \leq \frac{1}{k} \sum_{j=1}^{q} C(y_j)$. 

Theorem 3 Let P be an algorithm which returns a locally-optimal solution to 
the MAXNAESP problem. The performance ratio of P is given by 
$\gamma_P \geq \frac{k}{k+1}$ for $k \geq 3$; for $k \geq 2$ the performance ratio is 
$\frac{1}{2} + \epsilon$. 

Proof: 

($k \geq 3$) First we show that when the cardinality of each clause is $k \geq 3$ then 
the performance ratio of the algorithm is $\frac{k}{k+1}$. It should be noted that 
for this case we obtain the desired bound by considering only locally-1 
optimal solutions. It follows from the definition that $|A| \geq A(x) + A(y)$, 
since A is the set of all clauses that are satisfied, and $A(x)$ ($A(y)$) is 
the number of clauses that are satisfied with exactly one literal from the 
set X (Y). From Lemmas 1 and 2 and Corollaries 1 and 2 it follows 
that 

$|A| \geq A(x) + A(y) \geq \sum_{i=1}^{p} B(x_i) + \sum_{j=1}^{q} C(y_j) \geq k \times (|B| + |C|)$. 

Hence the performance ratio $\gamma_P$ is given by 

$\gamma_P = \frac{|A|}{|A| + |B| + |C|} \geq \frac{(|B| + |C|)k}{(|B| + |C|)k + |B| + |C|} = \frac{k}{k+1}$. 

($k \geq 2$) When $k \geq 2$, the number of satisfied clauses $|A|$ is at most 

hence the previous case is not applicable. Here we use the fact that 
the solution under consideration is locally-2 optimal. Let X (Y) be the 
set of variables which are set to 1 (0) in the locally optimal solution. 
Without loss of generality we assume that $|X| \geq |Y|$. Since the solution 
is locally-2 optimal, for every clause $(x_i, y_j)$ in the input 

$A(x_i) + A(y_j) \geq B(x_i) + C(y_j) + 2$ 

else, interchanging the values of $x_i$ and $y_j$ would result in more clauses 
being satisfied. Also, $A(x_i) \geq B(x_i)$ and $A(y_j) \geq C(y_j)$ because the 
solution is locally-1 optimal. We can conclude that 

Xi^X Xi£X 



2 1 ^ I 2 

i.e., $|B| + |C| \leq |A| - 1$. Therefore the performance ratio is 

$\frac{|A|}{|A| + (|B| + |C|)} \geq \frac{|A|}{|A| + (|A| - 1)} = \frac{1}{2} + \epsilon$ 

where $\epsilon = \frac{1}{2(2|A| - 1)}$. 



The following example illustrates that the above bound is tight when $k \geq 2$. 
Let the variables be $x_1, x_2, \ldots, x_m$ and $y_1, y_2, \ldots, y_m$. The first 
$m^2$ clauses are $(x_i, y_j)$ for i = 1..m and j = 1..m. The next $\binom{m}{2}$ 
clauses are $(x_i, x_j)$ for i = 1..m and j = 1..m such that $i \neq j$. The last 
$\binom{m}{2}$ clauses are $(y_i, y_j)$ for i = 1..m and j = 1..m such that 
$i \neq j$. A locally optimal solution is all the x's set to 1's and all the y's set 
to 0's. The ratio in this case is 

$\frac{m^2}{m^2 + 2\binom{m}{2}}$ 

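The example can be checked mechanically for small m. The sketch below is an illustration under stated assumptions: literals are indexed 0..2m-1 (the x's first), a clause is satisfied when its literals are not all equal, and the ratio computed is satisfied clauses over all clauses, which is the quantity bounded in the proof above. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def count_satisfied(clauses, assign):
    """Number of clauses that contain both a 0-literal and a 1-literal."""
    return sum(1 for c in clauses if len({assign[v] for v in c}) == 2)


def tight_instance(m):
    """Clauses of the example and the assignment x = 1, y = 0."""
    xs = list(range(m))                  # x_1..x_m -> indices 0..m-1
    ys = list(range(m, 2 * m))           # y_1..y_m -> indices m..2m-1
    clauses = [(x, y) for x in xs for y in ys]
    clauses += list(combinations(xs, 2))
    clauses += list(combinations(ys, 2))
    return clauses, [1] * m + [0] * m


def is_locally_optimal(clauses, assign):
    """True if no single flip and no double flip of a satisfied 2-clause helps."""
    base = count_satisfied(clauses, assign)
    moves = [(v,) for v in range(len(assign))]
    moves += [c for c in clauses if len(c) == 2 and assign[c[0]] != assign[c[1]]]
    for move in moves:
        trial = list(assign)
        for v in move:
            trial[v] ^= 1
        if count_satisfied(clauses, trial) > base:
            return False
    return True


def ratio(m):
    """Satisfied clauses divided by all clauses, as an exact fraction."""
    clauses, assign = tight_instance(m)
    return Fraction(count_satisfied(clauses, assign), len(clauses))
```

For m = 4 the instance has 16 satisfied clauses out of 28, so the ratio is 4/7, which equals 1/2 + 1/(2(2m - 1)).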


3 Conclusion 

As the main result in the paper, we propose a simple locally optimal 
approximation algorithm for the MAXNAESP problem whose performance ratio is 
$\frac{k}{k+1}$, where $k \geq 3$ is the minimum number of literals in a clause. For 
the case when $k \geq 2$ the performance ratio of the algorithm is 
$\frac{1}{2} + \epsilon$. Such an algorithm appears to be useful for a large class of 
problems in the general domain of satisfiability. Specifically, we note that this 
algorithm has a bound of $\frac{k}{k+1}$ for MAXSAT when $k \geq 3$, which follows 
directly from Lemmas 1 and 2. This bound is identical to the 
bound derived by Johnson for a simple greedy algorithm [Q. 

We note that MAXNAESP is equivalent (in the sense that the approximation 
ratios are preserved) to the maximization version of hypergraph 2-colorability 
and set splitting. Furthermore, the MAXNAESP problem where each clause has 
exactly 2 literals is equivalent to MAX-CUT. However, a simple adaptation of 
the semi-definite approximation algorithm for solving MAX-CUT to MAXNAESP 
(where each clause has at least two literals) does not appear to work. 
In addition, a straight-forward extension of the randomized rounding algorithm 
(using the linear-programming solution as the probabilities) [1] yields a bound 
of ^ . Also, the simple randomized algorithm, where each literal is selected with 
probability $\frac{1}{2}$, has an expected bound of $1 - 2^{1-k}$, where k is the 
minimum clause length. In contrast to this, the algorithm proposed in this paper 
obtains a bound of $\frac{k}{k+1}$ for the MAXNAESP problem when $k \geq 3$. For 
$k \geq 2$ the bound is $\frac{1}{2} + \epsilon$. Though the randomized algorithm has a 
better expected bound for $k \geq 4$, the algorithm proposed in this paper obtains 
a bound of $\frac{1}{2} + \epsilon$ (as compared to $\frac{1}{2}$ for the randomized 
algorithm) for the most general version of the MAXNAESP problem (and 
hypergraph 2-colorability), obtained when k = 2. 

Acknowledgements: The authors would like to thank an anonymous referee 
for pointing out an error in an earlier version of this paper. 
Abstract. A graph G = (V, E) is called a circle graph if there is a 
one-to-one correspondence between vertices in V and a set C of chords in a 
circle such that two vertices in V are adjacent if and only if the 
corresponding chords in C intersect. A subset V' of V is a dominating set of 
G if for all $u \in V$ either $u \in V'$ or u has a neighbor in V'. In addition, 
if no two vertices in V' are adjacent, then V' is called an independent 
dominating set; if G[V'] is connected, then V' is called a connected 
dominating set. Keil (Discrete Applied Mathematics, 42 (1993), 51-63) shows 
that the minimum dominating set problem and the minimum connected 
dominating set problem are both NP-complete even for circle graphs. He 
leaves open the complexity of the minimum independent dominating set 
problem. In this paper we show that the minimum independent dominating 
set problem on circle graphs is NP-complete. Furthermore we show 
that for any $\epsilon$, $0 < \epsilon < 1$, there does not exist an 
$n^{\epsilon}$-approximation algorithm for the minimum independent dominating 
set problem on n-vertex circle graphs, unless P = NP. Several other related 
domination problems on circle graphs are also shown to be as hard to 
approximate. 



1 Introduction 

For a graph G = (V, E), a subset V' of V is a dominating set of G if for all 
$u \in V$ either $u \in V'$ or u has a neighbor in V'. In addition, if no two 
vertices in V' are adjacent, then V' is called an independent dominating set; if 
the subgraph of G induced by V', denoted G[V'], is connected, then V' is called 
a connected dominating set; if G[V'] has no isolated nodes, then V' is called 
a total dominating set; and if G[V'] is a clique, then V' is called a dominating 
clique. Garey and Johnson m mention that problems of finding a minimum car- 
dinality dominating set (MDS), minimum cardinality independent dominating 
set (MIDS), minimum cardinality connected dominating set (MCDS), minimum 
cardinality total dominating set (MTDS), and minimum cardinality dominating 
clique (MDC) are all NP-complete for general graphs. 

Restrictions of these problems to various classes of graphs have been studied 
extensively HU. Table 1 shows the computational complexity of three of the 
problems mentioned above when restricted to different classes of graphs. P indicates 
the existence of a polynomial time algorithm and NPc indicates that the problem 
is NP-complete. Some of the references mentioned in the table are to original pa- 
pers where the corresponding result first appeared, while some are to secondary 
sources. It should be noted that this list is far from being comprehensive and 
that these problems have been studied for several other classes of graphs. In this 
paper we focus on the only “question mark” in the above table, corresponding 
to MIDS on circle graphs. Not only do we show that MIDS is NP-complete for 
circle graphs, we also show that it is extremely hard to approximate. 



Class of graphs          MDS        MIDS       MCDS 

trees                    P          P          P 
cographs                 P          P          P 
interval graphs          P          P          P 
permutation graphs       P          P          P 
cocomparability graphs   P          P          P 
bipartite graphs         NPc        NPc        NPc [19] 
comparability graphs     NPc        NPc [6]    NPc [18] 
chordal graphs           NPc        P          NPc 
circle graphs            NPc [10]   ?          NPc [10] 



Fig. 1. Complexity of three domination problems when restricted to different classes 
of graphs. 



A graph G = (V, E) is called a circle graph if there is a one-to-one corre- 
spondence between vertices in V and a set C of chords in a circle such that two 
vertices in V are adjacent if and only if the corresponding chords in C intersect. C 
is called the chord intersection model for G. Equivalently, the vertices of a circle 
graph can be placed in one-to-one correspondence with the elements of a set I 
of intervals such that two vertices are adjacent if and only if the corresponding 
intervals overlap, but neither contains the other. I is called the interval model of 
the corresponding circle graph. Representations of a circle graph as a graph or as 
a set of chords or as a set of intervals are equivalent via polynomial time trans- 
formations. So, without loss of generality, in specifying instances of problems, 
we assume the availability of the representation that is most convenient. 
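
The interval model translates directly into an adjacency test. The sketch below is illustrative only: it assumes intervals are given as (left, right) pairs with pairwise distinct endpoints, and the names overlap and overlap_graph are hypothetical.

```python
def overlap(a, b):
    """True if intervals a and b overlap but neither contains the other."""
    (l1, r1), (l2, r2) = a, b
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def overlap_graph(intervals):
    """Adjacency sets of the circle graph whose interval model is `intervals`."""
    n = len(intervals)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if overlap(intervals[i], intervals[j]):
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

For example, the intervals (0, 4), (2, 6), (1, 3), (5, 7) give a star centred on the second interval: it overlaps each of the other three, while (1, 3) is contained in (0, 4) and (5, 7) is disjoint from both.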

The complexity of MDS on circle graphs was first mentioned as being un- 
known in Johnson’s NP-completeness column m- Little progress was made 
towards solving this problem until Keil m resolved the complexity of MDS on 
circle graphs by showing that it is NP-complete. In the same paper, Keil sho- 
wed that for circle graphs MCDS is NP-complete, MTDS is NP-complete, and 
MDC has a polynomial algorithm. He left the status of MIDS for circle graphs 
open. In this paper we show that MIDS is also NP-complete on circle graphs. 
We also attack the approximability of MIDS on circle graphs and show that this 
problem is extremely hard to approximate. An $\alpha$-approximation algorithm for a 
minimization problem is a polynomial time algorithm that guarantees that the 
ratio of the cost of the solution to the optimal (over all instances of the problem) 
does not exceed $\alpha$. In Section 2 we show that for any $\epsilon$, 
$0 < \epsilon < 1$, there does not exist an $n^{\epsilon}$-approximation algorithm for 
MIDS on an n-vertex circle graph, unless P = NP. In Section 3 we present 
hardness of approximation results for related problems; for example, we show 
that MIDS on bipartite graphs has no $n^{\epsilon}$-approximation algorithm unless 
P = NP. This implies that MIDS, even when restricted to classes of graphs such 
as circle graphs and bipartite graphs, is as hard to approximate as the hardest 
problems (maximum clique and chromatic number, for example) [1]. 

2 Intractability of MIDS on Circle Graphs 

A first result in this section shows that the problem of finding a minimum in- 
dependent dominating set on circle graphs is NP-complete. This was one of the 
open questions in m- In a second result we strengthen the NP-completeness 
proof to show that for any $\epsilon$, $0 < \epsilon < 1$, there does not exist an $n^{\epsilon}$-approximation 
algorithm for MIDS on an n-vertex circle graph, unless P = NP. The decision 
problem formulation of the minimum independent dominating set problem on 
circle graphs (CMIDS) is as follows. 

Minimum Independent Dominating Set for Circle Graphs (CMIDS) 
INPUT: An interval representation of a circle graph G = (V,E) and a positive 
integer k. 

QUESTION: Is there an independent dominating set of G of size at most k? 
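
For very small instances the CMIDS question can be answered by exhaustive search, which makes the definition concrete. This is only a brute-force sketch that runs in exponential time; it assumes intervals are (left, right) pairs with distinct endpoints, and the function names are hypothetical.

```python
from itertools import combinations


def overlap(a, b):
    """True if intervals a and b overlap but neither contains the other."""
    (l1, r1), (l2, r2) = a, b
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def has_independent_dominating_set(intervals, k):
    """Decide CMIDS by trying every subset of at most k intervals."""
    n = len(intervals)
    adj = [{j for j in range(n) if j != i and overlap(intervals[i], intervals[j])}
           for i in range(n)]
    for size in range(k + 1):
        for cand in combinations(range(n), size):
            chosen = set(cand)
            # Independent: no chosen interval has a chosen neighbour.
            independent = all(adj[u].isdisjoint(chosen) for u in cand)
            # Dominating: every interval is chosen or has a chosen neighbour.
            dominating = all(v in chosen or bool(adj[v] & chosen) for v in range(n))
            if independent and dominating:
                return True
    return False
```

The intervals (0, 3), (2, 5), (4, 7), (6, 9) form a path on four vertices in the overlap graph; no single interval dominates the other three, but the first and last together do, so the answer is no for k = 1 and yes for k = 2.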
Theorem 1. CMIDS is NP-complete. 

Proof: Clearly CMIDS is in NP. The proof that the problem is NP-hard is 
organized in two parts. In the first part we present a polynomial time 
reduction from 3SAT to CMIDS. Let F be an arbitrary instance of 3SAT. Let 
$X = \{x_1, x_2, \ldots, x_q\}$ be the set of boolean variables and let 
$C = \{C_1, C_2, \ldots, C_m\}$ be the set of clauses in F. Let I be the instance of 
CMIDS produced by the reduction from F and let G(I) be the overlap graph of I. 
In the second part we choose an integer k and show that G(I) has an 
independent dominating set of size k or less if and only if F is satisfiable. 

In the following we describe the reduction from 3SAT to CMIDS. Without 
loss of generality we assume that no variable appears more than once in a 
clause. We consider first each clause $C_j$ separately and create six intervals 
with types a, b, f, t, p and s associated with each literal that appears in $C_j$. 
Including the type t interval in the independent dominating set will correspond 
to setting the associated literal to true and including the type f interval in the 
dominating set will correspond to setting the associated literal to false. 

For each clause Cj we create three pairs of intervals of types u and w. The 
purpose of each such pair is to dominate the type a and type b intervals associated 
with two out of the three literals that appear in Cj. The key idea of our proof 
is as follows. Suppose that Cj is satisfied by some assignment of truth values to 
variables. Then we can construct an independent dominating set that contains 
a type t interval corresponding to a true literal in Cj and a pair of type u, 
type w intervals associated with Cj. The type t interval dominates one pair of 
type a, type b intervals while the other two pairs of type a, type b intervals are 
dominated by the type u, type w pair. In this manner all the type a and type b 
intervals associated with Cj are dominated. Including the type t interval in our 
independent dominating set will correspond to the clause Cj being satisfied by 
a literal associated to this interval. 

Finally we connect clauses together by tt, ff, tf and ft type intervals so as 
to maintain consistency of truth values of literals throughout all clauses. 
Figure 2 shows all intervals associated with a two clause and four variable 
formula in which $C_1 = \{x_1, x_2, x_3\}$ and $C_2 = \{x_1, x_2, x_4\}$. 

In the following we present our construction in detail. Let us consider an 
arbitrary clause $C_j$. For each variable $x_i$ that appears in $C_j$ we construct two 
independent intervals $a_j^i = [y_j + 12i, z_j + 5i]$ and 
$b_j^i = [y_j + 12i + 1, z_j + 5i - 1]$, where $y_j = 12(q + 2)j$ and 
$z_j = y_j + 12(q + 1)$. The purpose of the offset $y_j$ is to ensure that the 
intervals associated with different clauses are disjoint. Note that since 
$b_j^i \subset a_j^i$, these two intervals do not overlap. Next we construct truth 
setting intervals $t_j^i = [y_j + 12i - 1, y_j + 12i + 5]$ and 
$f_j^i = [y_j + 12i + 4, y_j + 12i + 10]$ associated with the variable $x_i$. The 
intervals $t_j^i$ and $f_j^i$ dominate each other, so at most one of them can appear 
in any independent dominating set. Furthermore $t_j^i$ dominates both $a_j^i$ and 
$b_j^i$ and $f_j^i$ does not dominate either. The truth setting intervals will help 
us in determining the correspondence between a satisfying truth assignment and 
an independent dominating set of size k or less. 

To ensure that exactly one of the intervals $t_j^i$ and $f_j^i$ associated with a 
literal involving $x_i$ appears in any “small enough” independent dominating set, 
we associate with $x_i$ two other independent intervals 
$p_j^i = [y_j + 12i + 3, y_j + 12i + 6]$ and $s_j^i = [y_j + 12i + 2, y_j + 12i + 7]$ 
adjacent to both $t_j^i$ and $f_j^i$ and to no other interval in I. Later we will 
argue that any “small enough” independent dominating set must contain one of 
the intervals $t_j^i$ or $f_j^i$ in order to dominate $p_j^i$ and $s_j^i$. Otherwise we 
would need to include both $p_j^i$ and $s_j^i$ in our independent dominating set 
and the set would become too large. 
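
The adjacencies claimed for the six intervals of one literal can be checked numerically. The sketch below is an illustration, not part of the proof: it reads the endpoints of the construction as a = [y + 12i, z + 5i], b = [y + 12i + 1, z + 5i - 1], t = [y + 12i - 1, y + 12i + 5], f = [y + 12i + 4, y + 12i + 10], p = [y + 12i + 3, y + 12i + 6] and s = [y + 12i + 2, y + 12i + 7], with y = 12(q + 2)j and z = y + 12(q + 1); the function names are hypothetical.

```python
def overlap(a, b):
    """True if intervals a and b overlap but neither contains the other."""
    (l1, r1), (l2, r2) = a, b
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def literal_gadget(i, j, q):
    """Intervals of types a, b, t, f, p, s for variable x_i in clause C_j."""
    y = 12 * (q + 2) * j          # offset of clause C_j
    z = y + 12 * (q + 1)
    base = y + 12 * i
    return {
        "a": (base, z + 5 * i),
        "b": (base + 1, z + 5 * i - 1),
        "t": (base - 1, base + 5),
        "f": (base + 4, base + 10),
        "p": (base + 3, base + 6),
        "s": (base + 2, base + 7),
    }


def adjacent_pairs(gadget):
    """All pairs of interval types (in alphabetical order) that are adjacent."""
    names = sorted(gadget)
    return {(u, v) for u in names for v in names
            if u < v and overlap(gadget[u], gadget[v])}
```

Under this reading, t is adjacent to a, b, f, p and s; f is adjacent to t, p and s only; and a, b, p, s are pairwise non-adjacent, matching the properties stated in the text.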

We now construct intervals associated to the clause $C_j$. The clause $C_j$ is 
satisfied if and only if at least one of the three literals involved in $C_j$ is true. 
This will correspond to at least one of the intervals $t_j^i$ appearing in the 
independent dominating set. Such an interval will dominate the intervals $a_j^i$ 
and $b_j^i$ associated to the literal involving $x_i$. In order to dominate the other 
four $a_j$ and $b_j$ type intervals associated to the other two literals in $C_j$ we 
add six more intervals of type u and w to I as follows. Suppose that $C_j$ consists 
of three literals corresponding to three variables $x_i$, $x_k$ and $x_l$, with 
$i < k < l$. We create the intervals $u_j^{il} = [z_j, z_j + 5i + 3]$, 
$w_j^{il} = [z_j + 5k + 1, z_j + 5l + 3]$, $u_j^{ik} = [z_j + 1, z_j + 5k + 3]$, 
$w_j^{ik} = [z_j + 2, z_j + 5k + 2]$, $u_j^{kl} = [z_j + 5i + 1, z_j + 5l + 2]$ 
and $w_j^{kl} = [z_j + 5i + 2, z_j + 5l + 1]$, where $z_j$ is as defined above. Note 
that the two intervals $u_j^{il}$ and $w_j^{il}$ are independent and dominate the 
remaining 4 intervals of type $u_j$ and type $w_j$. Furthermore, $u_j^{il}$ and 
$w_j^{il}$ dominate the following 4 intervals of type $a_j$ and $b_j$: $a_j^i$, $b_j^i$, 
$a_j^l$ and $b_j^l$. The interval pair $u_j^{ik}$ and $w_j^{ik}$ and the interval pair 
$u_j^{kl}$ and $w_j^{kl}$ have similar properties. 

Fig. 2. The intervals constructed from the two clause and four variable formula 
$(x_1 \vee x_2 \vee x_3) \wedge (x_1 \vee x_2 \vee x_4)$ 
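
The properties claimed for the three pairs of type u and type w intervals can be checked on a concrete clause. The sketch below rests on assumptions: the endpoints are read as [z, z + 5i + 3] and [z + 5k + 1, z + 5l + 3] for the first pair, [z + 1, z + 5k + 3] and [z + 2, z + 5k + 2] for the second, and [z + 5i + 1, z + 5l + 2] and [z + 5i + 2, z + 5l + 1] for the third; each pair is keyed by the two variables whose type a and type b intervals it dominates, which is a hypothetical labelling chosen for this illustration.

```python
def overlap(a, b):
    """True if intervals a and b overlap but neither contains the other."""
    (l1, r1), (l2, r2) = a, b
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def clause_gadget(i, k, l, j, q):
    """Type a/b intervals and the three (u, w) pairs of clause C_j."""
    assert 1 <= i < k < l <= q
    y = 12 * (q + 2) * j
    z = y + 12 * (q + 1)
    ab = {}
    for v in (i, k, l):
        ab[("a", v)] = (y + 12 * v, z + 5 * v)
        ab[("b", v)] = (y + 12 * v + 1, z + 5 * v - 1)
    pairs = {
        (i, l): ((z, z + 5 * i + 3), (z + 5 * k + 1, z + 5 * l + 3)),
        (i, k): ((z + 1, z + 5 * k + 3), (z + 2, z + 5 * k + 2)),
        (k, l): ((z + 5 * i + 1, z + 5 * l + 2), (z + 5 * i + 2, z + 5 * l + 1)),
    }
    return ab, pairs


def dominated(candidates, pair):
    """Names of the candidate intervals adjacent to a member of the pair."""
    return {name for name, iv in candidates.items()
            if any(overlap(iv, member) for member in pair)}
```

For variables 1, 2, 4 in a formula with q = 4, each pair is independent, dominates exactly the four type a and type b intervals of the two variables in its key, and dominates the four u and w intervals of the other two pairs.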

If we repeat the construction described above for each clause $C_j$, 
$1 \leq j \leq m$, we obtain m pairwise disjoint subsets of intervals, each subset 
associated to one clause. A last piece of our construction is a gadget that forces 
a variable to have the same truth value throughout all clauses. For this we 
create intervals of type tt, ff, tf and ft that connect clauses together as 
follows. For any j and k, $1 \leq j < k \leq m$, if $x_i$ appears in both $C_j$ and 
$C_k$ or if $\bar{x}_i$ appears in both $C_j$ and $C_k$ we create two intervals 
$tf_{jk}^i$ and $ft_{jk}^i$ such that $tf_{jk}^i$ overlaps $t_j^i$ and $f_k^i$ and 
$ft_{jk}^i$ overlaps $f_j^i$ and $t_k^i$. Furthermore, if $x_i$ appears in $C_j$ and 
$\bar{x}_i$ appears in $C_k$ or if $\bar{x}_i$ appears in $C_j$ and $x_i$ appears in 
$C_k$, we create two intervals $tt_{jk}^i$ and $ff_{jk}^i$ such that $tt_{jk}^i$ 
overlaps $t_j^i$ and $t_k^i$ and $ff_{jk}^i$ overlaps $f_j^i$ and $f_k^i$. We now 
specify the intervals of type tt, ff, tf, and ft. If $x_i$ appears in both $C_j$ and 
$C_k$ or $\bar{x}_i$ appears in both $C_j$ and $C_k$, then we add 
$tf_{jk}^i = [y_j + 12i + 1, y_k + 12i + 8]$, 
$ft_{jk}^i = [y_j + 12i + 9, y_k + 12i]$ to the set I. If $x_i$ appears in one of 
$C_j$ or $C_k$ and $\bar{x}_i$ appears in the other, then we add 
$tt_{jk}^i = [y_j + 12i + 1, y_k + 12i]$, 
$ff_{jk}^i = [y_j + 12i + 9, y_k + 12i + 8]$ to the set I. 
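
The way the linking intervals meet the truth setting intervals of the two clauses can be checked with a short script. This is a sketch under assumptions: the endpoints are read as tf = [y_j + 12i + 1, y_k + 12i + 8], ft = [y_j + 12i + 9, y_k + 12i], tt = [y_j + 12i + 1, y_k + 12i] and ff = [y_j + 12i + 9, y_k + 12i + 8] with y_j = 12(q + 2)j, only adjacency with the four truth setting intervals is examined, and the function names are hypothetical.

```python
def overlap(a, b):
    """True if intervals a and b overlap but neither contains the other."""
    (l1, r1), (l2, r2) = a, b
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def truth_intervals(i, j, q):
    """The pair (t, f) for variable x_i in clause C_j."""
    y = 12 * (q + 2) * j
    return (y + 12 * i - 1, y + 12 * i + 5), (y + 12 * i + 4, y + 12 * i + 10)


def linking_intervals(i, j, k, q, same_sign):
    """tf/ft if the variable has the same sign in C_j and C_k, else tt/ff."""
    yj, yk = 12 * (q + 2) * j, 12 * (q + 2) * k
    if same_sign:
        return {"tf": (yj + 12 * i + 1, yk + 12 * i + 8),
                "ft": (yj + 12 * i + 9, yk + 12 * i)}
    return {"tt": (yj + 12 * i + 1, yk + 12 * i),
            "ff": (yj + 12 * i + 9, yk + 12 * i + 8)}


def link_neighbours(i, j, k, q, same_sign):
    """For each linking interval, the truth setting intervals it overlaps."""
    tj, fj = truth_intervals(i, j, q)
    tk, fk = truth_intervals(i, k, q)
    named = {"t_j": tj, "f_j": fj, "t_k": tk, "f_k": fk}
    return {name: {n for n, iv in named.items() if overlap(link, iv)}
            for name, link in linking_intervals(i, j, k, q, same_sign).items()}
```

With i = 1, j = 1, k = 2 and q = 3, tf meets exactly t_j and f_k and ft meets exactly f_j and t_k, so an inconsistent choice such as t_j together with f_k leaves ft without a neighbour among the chosen truth setting intervals.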

This completes the construction of I. We now set k = 5m and prove the 
following lemma, which completes the proof of the theorem. 

Lemma 1. G has an independent dominating set of size k if and only if F is 
satisfiable. 

Proof. Given a satisfying truth assignment A for F we show how to construct a 
set D of size k of independent intervals that dominate all intervals in $I - D$. If 
$x_i$ is true in A we include in D, $t_j^i$ for each clause $C_j$ in which $x_i$ 
appears and $f_j^i$ for each clause $C_j$ in which $\bar{x}_i$ appears. If $x_i$ is 
false in A we include in D, $f_j^i$ for each clause $C_j$ in which $x_i$ appears and 
$t_j^i$ for each clause in which $\bar{x}_i$ appears. This ensures that all the 
$p_j^i$ and the $s_j^i$ type intervals have been dominated. Note that so far 3m 
intervals have been included in D. 

We now show that all the tt, ft, tf and ff type intervals are also dominated 
by the intervals included in D so far. Consider an interval $tf_{jk}^i$. The 
presence of this interval in I implies one of four possibilities: (a) $x_i$ is true 
and $x_i$ appears in both $C_j$ and $C_k$, (b) $x_i$ is true and $\bar{x}_i$ appears in 
both $C_j$ and $C_k$, (c) $x_i$ is false and $x_i$ appears in both $C_j$ and $C_k$, and 
(d) $x_i$ is false and $\bar{x}_i$ appears in both $C_j$ and $C_k$. If $x_i$ is true and 
it appears in both $C_j$ and $C_k$, then $t_j^i$ dominates $tf_{jk}^i$. Otherwise if 
$x_i$ is true and $\bar{x}_i$ appears in both $C_j$ and $C_k$, then $f_k^i$ dominates 
$tf_{jk}^i$. If $x_i$ is false and it appears in both $C_j$ and $C_k$, then $f_k^i$ 
dominates $tf_{jk}^i$. Otherwise if $x_i$ is false and $\bar{x}_i$ appears in both 
$C_j$ and $C_k$, then $t_j^i$ dominates $tf_{jk}^i$. So $tf_{jk}^i$ is dominated by one 
of the intervals already included in D in each of the four possible cases. In a 
similar manner it can be shown that the intervals of type ft, tt, and ff are all 
dominated by the intervals already included in D. 

Let us consider a clause $C_j$ that contains three literals corresponding to 
variables $x_i$, $x_k$ and $x_l$, where $i < k < l$. Since A is a satisfying truth 
assignment, at least one of the literals involving $x_i$, $x_k$ and $x_l$ must be 
true. Let us assume without loss of generality that the literal involving $x_i$ is 
true and so we have included $t_j^i$ in D. In this case we add $u_j^{kl}$ and 
$w_j^{kl}$ to D, thus ensuring that all the $a_j$, $b_j$, $u_j$ and $w_j$ type intervals 
have been dominated. So we have added two more intervals associated with the 
clause $C_j$ to D. Repeating this for each clause will add 2m intervals to the 
independent set and will also ensure that all intervals in G(I) are dominated. 
Thus we have an independent dominating set of size 5m as required. 

We now establish the other direction of Lemma 1: if there is a dominating 
independent set $D \subseteq I$ of size k, then there is a truth assignment to the 
variables in X that satisfies F. Recall that associated with each variable $x_i$ 
that appears in a clause $C_j$ there is a pair of intervals: $s_j^i$ and $p_j^i$. Since 
no interval in I is adjacent to two p type intervals or two s type intervals and 
since only $t_j^i$ and $f_j^i$ are adjacent to $p_j^i$ and $s_j^i$, D must contain, for 
each i and j, either $t_j^i$ or $f_j^i$ or both $s_j^i$ and $p_j^i$. If for each i and j, D 
contains $t_j^i$ or $f_j^i$ (but not both, since $t_j^i$ and $f_j^i$ are adjacent to each 
other), then exactly these 3m type t and type f intervals in D suffice to 
dominate all the type p and type s intervals. This leaves exactly 2m intervals to 
dominate the rest of the intervals in I. If for some i and j, D contains $p_j^i$ and 
$s_j^i$ then more than 3m intervals in D are necessary to dominate the type p and 
type s intervals. This leaves fewer than 2m intervals to dominate the rest of the 
intervals in I. 

There is no interval in I adjacent to u type intervals or w type intervals 
associated with different clauses. In other words, for any $j \neq k$, no interval 
in I is adjacent to two intervals of types $u_j$ and $w_k$, or $u_j$ and $u_k$, or 
$w_j$ and $w_k$. Thus the sets of intervals in D which dominate the u and w type 
intervals associated with different clauses are pairwise disjoint. Also no interval 
in I is adjacent to all the six $u_j$ and $w_j$ type intervals associated with a 
clause $C_j$. Therefore D must contain at least two intervals per clause in order 
to dominate all the u and w type intervals. Combined with the argument in the 
previous paragraph, this means exactly 3m intervals in D dominate the type p 
and type s intervals and the remaining 2m intervals dominate the type u and 
type w intervals. This further implies that for each variable $x_i$ appearing in a 
clause $C_j$, either $t_j^i$ or $f_j^i$ (but not both) belongs to D. 

We first show that for each clause $C_j$, D contains at least one type $t_j$ 
interval, one type $u_j$ interval, and one type $w_j$ interval. Suppose that the 
three variables that appear in $C_j$ are $x_i$, $x_k$, and $x_l$, where $i < k < l$. As 
shown before, D must contain two intervals which dominate all the $u_j$ and 
$w_j$ type intervals. The only intervals adjacent to a $u_j$ or a $w_j$ type interval 
are the $a_j$, $b_j$, $u_j$ and $w_j$ type intervals. No two $a_j$ type intervals or two 
$b_j$ type intervals can occur in D, since they are not independent. Suppose that 
D contains an interval $a_j^i$. This interval is adjacent to at most four out of the 
six intervals of type $u_j$ or $w_j$. This leaves at least one type $u_j$ interval, at 
least one type $w_j$ interval, and $b_j^i$ undominated. It is easy to see that 
independent of which type $t_j$ and which type $f_j$ intervals are included in D, it 
is not possible for one interval to dominate all of these undominated intervals. 
Thus we have shown that D cannot contain a type $a_j$ interval (since the above 
argument can be used to show that D cannot contain $a_j^k$ or $a_j^l$ as well). 
Likewise it can be shown that D does not contain any of the $b_j$ type intervals. 
Hence D must contain two independent intervals of type $u_j$ or $w_j$ in order to 
dominate all the $u_j$ and $w_j$ type intervals. From the structure of I it can be 
seen that such two intervals must be of the form $u_j^{rs}$ and $w_j^{rs}$ for two of 
the variables $x_r$, $x_s$ in $C_j$. Besides dominating all the $u_j$ and $w_j$ type 
intervals, any pair of intervals $u_j^{rs}$, $w_j^{rs}$ also dominates $a_j^r$, $b_j^r$, 
$a_j^s$ and $b_j^s$. The other two intervals of type $a_j$ and $b_j$ associated with 
the clause $C_j$ must be dominated by a type t interval. 

The type t and type f intervals have thus accounted for 3m of the intervals in 
D and the two type u and type w intervals per clause account for the remaining 
2m intervals in D. Since all 5m intervals in D have been accounted for, we have 
to show that all intervals in $I - D$ are dominated. The above argument has 
established that all the type p, type s, type t, type f, type a, type b, type u, 
and type w intervals have been dominated. This only leaves the type tt, type tf, 
type ff, and type ft intervals. In the following we observe that these intervals 
are dominated if the truth values included in D are “consistent” across variables 
and clauses. 

(a) if $x_i$ (or $\bar{x}_i$) appears in two different clauses $C_j$ and $C_k$ and D 
contains $t_j^i$ (respectively, $f_j^i$), then D must also contain $t_k^i$ 
(respectively, $f_k^i$) in order to dominate $ft_{jk}^i$ (respectively, $tf_{jk}^i$). 

(b) if $x_i$ appears in $C_j$, $\bar{x}_i$ appears in $C_k$ and D contains $t_j^i$ 
(respectively, $f_j^i$), then D must also contain $f_k^i$ (respectively, $t_k^i$) in 
order to dominate $ff_{jk}^i$ (respectively, $tt_{jk}^i$). 

Now we can construct a satisfying truth assignment A for F as follows. If D 
contains a type $t_j^i$ interval associated to a literal involving $x_i$, then we set 
this literal to true in A. Similarly, if D contains a type $f_j^i$ interval associated 
with a literal involving $x_i$, then we set this literal to false in A. As mentioned 
above, D must contain those type t and type f intervals that correspond to a 
literal having the same truth value throughout all clauses. Since for any clause 
$C_j$ there exists i such that $t_j^i$ occurs in D, we conclude that any clause $C_j$ 
contains a true literal and therefore A satisfies F as required. 

In the next result, we strengthen the above NP-completeness proof to show that 
the CMIDS problem is extremely hard even to approximate. 

Theorem 2. For any $\epsilon$, $0 < \epsilon < 1$, CMIDS does not have an 
$n^{\epsilon}$-approximation algorithm, unless P = NP. Here n is the number of 
intervals in the input to CMIDS. 

Proof. The proof of the theorem is organized in three parts. In the first part, 
we present a reduction from 3SAT to CMIDS similar to the one described in 
Theorem 1. Let F be an arbitrary instance of 3SAT with 
$X = \{x_1, x_2, \ldots, x_q\}$ as the set of boolean variables and 
$C = \{C_1, C_2, \ldots, C_m\}$ as the set of clauses. Let J be the instance of 
CMIDS produced by the reduction from F and let G(J) be the overlap graph of 
J. The reduction is parameterized by a positive integer $\alpha$ and the size of J 
depends on the size of F and on $\alpha$. 

In the second part we show that the reduction has the following two 
properties: (i) if F is satisfiable, then a minimum independent dominating set in 
G(J) has size 5m or less and (ii) if F is not satisfiable, then a minimum 
independent dominating set in G(J) has size greater than $5\alpha m$. 

In the third part, we show that for any $\epsilon$, $0 < \epsilon < 1$, if 
$\alpha \leq |J|^{\epsilon}$, then $|J|$ is a polynomial function of m. This ensures that 
the reduction described in the first part of the proof takes polynomial time. 

These three pieces together prove the theorem. To see this suppose that for 
some $\epsilon$, $0 < \epsilon < 1$, we have an $n^{\epsilon}$-approximation algorithm A 
for CMIDS. Then we could use A to solve an arbitrary instance F of 3SAT in 
polynomial time as follows. Part 3 of the proof implies that given F, we can 
construct J in polynomial time. We then apply A to G(J). If F is satisfiable, 
then the size of a minimum independent dominating set in G(J) is 5m or less 
and therefore the size of the independent dominating set found by A is 
$5n^{\epsilon}m$ or less. If F is not satisfiable, then the size of a minimum 
independent dominating set in G(J) is greater than $5n^{\epsilon}m$ and therefore 
the size of the independent dominating set found by A is greater than 
$5n^{\epsilon}m$. Hence, comparing the size of the set found by A with the value 
$5n^{\epsilon}m$ resolves the satisfiability of F in polynomial time. 

In the following we describe the reduction from 3SAT to CMIDS. The set of 
intervals J is a simple extension of the set I used in the proof of Theorem 1 and 
we only sketch the modifications here. The reduction is depicted in Figure 2. 
For an interval a, let l(a) denote the left endpoint of a and r(a) denote the right 
endpoint of a. Consider an arbitrary clause $C_j$. For each variable $x_i$ that 
appears in $C_j$ the set I contains two independent intervals $a_j^i$ and $b_j^i$. 
Recall that these two intervals overlap exactly the same set of intervals. 
Corresponding to these two intervals in I we add $r = 5\alpha m$ independent 
intervals, named $a_j^{i,1}, a_j^{i,2}, \ldots, a_j^{i,r}$, to J such that (i) 
$a_j^{i,1} = a_j^i$, $a_j^{i,r} = b_j^i$ and (ii) $a_j^{i,\ell+1} \subset a_j^{i,\ell}$ for 
$\ell = 1, \ldots, r - 1$. This ensures that the intervals $a_j^{i,\ell}$ for all $\ell$ 
overlap exactly the same set of intervals. Let us call $a_j^{i,1}$ and $a_j^{i,r}$ 
originals and $a_j^{i,\ell}$ for all $\ell$, $1 < \ell < r$, copies. For each variable 
$x_i$ that appears in $C_j$, I also contains four more intervals $t_j^i$, $f_j^i$, 
$s_j^i$ and $p_j^i$. We include in J the truth setting intervals $t_j^i$ and $f_j^i$ as 
they appear in the set I. Corresponding to $s_j^i$ and $p_j^i$ in I we add to J, r 
independent intervals $s_j^{i,1}, s_j^{i,2}, \ldots, s_j^{i,r}$ such that (i) 
$s_j^{i,1} = s_j^i$, $s_j^{i,r} = p_j^i$ and (ii) $s_j^{i,\ell+1} \subset s_j^{i,\ell}$ for 
$\ell = 1, \ldots, r - 1$. This ensures that the intervals $s_j^{i,\ell}$ for all 
$\ell$ overlap exactly the same set of intervals. As before, let us call $s_j^{i,1}$ 
and $s_j^{i,r}$ originals and $s_j^{i,\ell}$ for all $\ell$, $1 < \ell < r$, copies. The 
set I also contains intervals of type $u_j$ and $w_j$ associated with each clause 
$C_j$. We add these to J as they appear in I. 
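
One simple way to realise r nested intervals between two originals is to interpolate the endpoints linearly. The sketch below is an illustration and not necessarily the authors' choice of endpoints: it assumes the inner original lies strictly inside the outer one and that no other interval has an endpoint strictly between corresponding endpoints of the two originals, in which case every copy overlaps the same intervals as the originals. The name nested_copies is hypothetical.

```python
from fractions import Fraction


def nested_copies(outer, inner, r):
    """Return r strictly nested intervals; the first is outer, the last inner."""
    (lo, ro), (li, ri) = outer, inner
    if r < 2 or not (lo < li < ri < ro):
        raise ValueError("need r >= 2 and inner strictly inside outer")
    copies = []
    for step in range(r):
        frac = Fraction(step, r - 1)     # 0 for the outer original, 1 for the inner
        copies.append((lo + frac * (li - lo), ro - frac * (ro - ri)))
    return copies
```

For example, nested_copies((0, 10), (1, 9), 5) starts with (0, 10), ends with (1, 9), and moves each endpoint by 1/4 per step; exact fractions avoid rounding issues when r is large.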

Finally, recall that in the construction used in Theorem 1 clauses that 
contain common variables are connected together by tt, ff, tf and ft type 
intervals in I. Corresponding to each of these intervals in I we add r new 
intervals to J as follows. Without loss of generality we consider intervals of 
type tt. The construction is identical for intervals of type ff, tf, and ft. 
Suppose that I contains the interval $tt_{jk}^i$. To J we add r independent 
intervals named $tt_{jk}^{i,1}, \ldots, tt_{jk}^{i,r}$ such that (i) 
$tt_{jk}^{i,1} = tt_{jk}^i$, (ii) $tt_{jk}^{i,\ell+1} \subset tt_{jk}^{i,\ell}$ and (iii) the 
endpoints of the intervals are chosen close enough so that for all $\ell$, the 
intervals $tt_{jk}^{i,\ell}$ overlap exactly the same set of intervals. As before, we 
call $tt_{jk}^{i,1}$ an original and the rest of the type tt intervals copies. We use 
a similar nomenclature for the type ft, tf, and ff intervals. This completes the 
construction of J. We are now ready to prove the following lemma. 

Lemma 2. If F is satisfiable, then a minimum independent dominating set in 
G(J) has size 5m or less. If F is not satisfiable, then a minimum independent 
dominating set in G(J) has size greater than $5\alpha m$. 

Proof. Suppose that F is satisfiable. As shown in the proof of Theorem 1 we 
can construct an independent set D of size 5m of type t, type f, type u and type 
w intervals that dominates all intervals in $I - D$. These intervals in D appear in 
J as well. Clearly, every interval in $J - D$ that appears in $I - D$ is dominated 
by some interval in D. The intervals in $J - D$ that do not appear in $I - D$ are 
copies. Since all the originals are dominated and the copies have exactly the same 
neighbors as the originals, and since none of the originals are in D, it follows 
that the copies are also dominated. This ensures that all intervals in $J - D$ are 
dominated by intervals in D. 

Now suppose that $F$ is not satisfiable. Let $D$ be a smallest set of independent
intervals in $J$ that dominates all intervals in $J - D$. Suppose that $D$ contains
an interval of type a, say $a^*$. Since intervals in $D$ are independent, $D$ cannot
contain any interval that overlaps with $a^*$. Since the original and all the copies
of type a have exactly the same set of neighbors, to be dominated, all intervals
of type a must be in $D$. A similar argument can be made for intervals of type tt,
ft, tf, ff, and s. This means that if $D$ contains any of these intervals, then its
size is at least $r = 5\alpha m$. Now suppose that $D$ does not contain an interval of the
following types: a, tt, tf, ft, ff, or s. The absence of the type s intervals in $D$
implies that for each appropriate $i, j$ pair, $D$ contains $t_i^j$ or $f_i^j$. Thus we have an
assignment of truth values to all the literals. The absence of type tt, ff, tf, and
ft intervals implies that these truth values are consistent with each other. This
implies that for some clause $C_j$ and for all relevant $i$, $f_i^j \in D$. This leaves all the
type $a_j$ intervals undominated and these cannot be dominated by the inclusion
of the type u and type w intervals in $D$. This implies that at least one type $a_j$
interval has to be included in $D$. But, if one type $a_j$ interval is included in $D$,
then all $r = 5\alpha m$ type $a_j$ intervals have to be included in $D$. This implies that
if $F$ is not satisfiable, any dominating set of $J$ contains at least $5\alpha m$ intervals.

To complete the proof of this theorem, it remains to show that the
transformation from $F$ to $J$ takes time that is polynomial in the size of $F$ when
$\alpha \le |J|^{\varepsilon}$. The following lemma states this claim formally. The proof is omitted
due to space constraints and mainly consists of showing that for all $\alpha \le |J|^{\varepsilon}$,
the size of $J$ is bounded above by a polynomial in $m$.

Lemma 3. For any $\varepsilon$, $0 < \varepsilon < 1$, and any $\alpha \le |J|^{\varepsilon}$, $|J|$ is bounded above by a
polynomial function of $m$.

It is easy to see that the reduction described at the beginning of the proof
takes time proportional to $|J|$. By showing that when $\alpha \le |J|^{\varepsilon}$, $|J|$ is bounded
above by a polynomial function of $m$, we have established that the reduction is
a polynomial time reduction.



3 Related Hardness of Approximation Results

Let BMIDS be the minimum independent dominating set problem restricted to
bipartite graphs. Irving [12] shows that for any fixed $\alpha > 1$, no polynomial time
$\alpha$-approximation algorithm exists for BMIDS unless P = NP. We strengthen this
result to show that BMIDS is not approximable within a factor of $n^{\varepsilon}$ for any
$0 < \varepsilon < 1$, unless P = NP. Proofs of theorems in this section are omitted due to
space constraints.

Theorem 3. For any $\varepsilon$, $0 < \varepsilon < 1$, BMIDS does not have an $n^{\varepsilon}$-approximation
algorithm unless P = NP. Here $n$ is the number of intervals in the input to
BMIDS.

We now determine the complexity of the "bichromatic" minimum independent
dominating set problem and show that this problem is also hard to approximate.
This problem is motivated by problems in polygon decomposition (see [3]
for details). Let TRUE, INDEP, CONNECT, TOTAL, and CLIQUE refer to properties of
a subset of vertices of a graph. In particular, for any graph G = (V, E), any
subset of V has the property TRUE, any independent subset of V has the property
INDEP, any subset of V that induces a connected subgraph has the property
CONNECT, any subset of V that induces a subgraph with no isolated vertices has
the property TOTAL, and any subset of V that induces a clique has the
property CLIQUE. Letting $\pi \in$ {TRUE, INDEP, CONNECT, TOTAL, CLIQUE} we have the
following classes of problems.

Minimum $\pi$-Dominating Set on Circle Graphs ($\pi$-CMDS)

INPUT: An interval representation of a circle graph G = (V, E) and a natural
number k.

QUESTION: Does G have a dominating set of size no greater than k, having
property $\pi$?

Minimum $\pi$-Dominating Set on Bichromatic Circle Graphs ($\pi$-BCMDS)

INPUT: An interval representation of a circle graph G = (V, E), each of whose
vertices is coloured RED or BLUE, and a natural number k.

QUESTION: Does G have a RED set of vertices R of size no greater than k such
that (i) R has property $\pi$ and (ii) all the BLUE vertices are dominated by R?

Theorem 4. For any $\pi \in$ {TRUE, INDEP, CONNECT, TOTAL, CLIQUE} there is a
polynomial time reduction from $\pi$-CMDS to $\pi$-BCMDS.

Corollary 1. $\pi$-BCMDS is NP-complete for $\pi \in$ {TRUE, INDEP, CONNECT, TOTAL}.
Furthermore, for $\pi$ = INDEP, there exists no $n^{\varepsilon}$-approximation algorithm for
$\pi$-BCMDS on n-vertex bichromatic circle graphs, unless P = NP.

4 Conclusions 

The results in our paper show that independent domination problems on circle
graphs are harder than expected. The topic of approximation algorithms for
dominating set problems on restricted classes of graphs has not received much
attention from researchers as yet. Our result in this paper (MIDS on circle
graphs is extremely hard to approximate), along with the results on the existence
of polynomial time constant-factor approximation algorithms for the MDS, MTDS,
and MCDS problems (Damian-Iordache and Pemmaraju, in these proceedings),
are probably the first of this kind.

Acknowledgement: We thank J. Mark Keil for sharing with us preliminary ideas
on the NP-completeness proof of CMIDS.



Abstract. A graph G = (V, E) is called a circle graph if there is a
one-to-one correspondence between vertices in V and a set C of chords in a
circle such that two vertices in V are adjacent if and only if the
corresponding chords in C intersect. A subset V' of V is a dominating set of
G if for all $u \in V$ either $u \in V'$ or u has a neighbor in V'. In addition, if
G[V'] is connected, then V' is called a connected dominating set; if G[V']
has no isolated vertices, then V' is called a total dominating set. Keil
(Discrete Applied Mathematics, 42 (1993), 51-63) shows that the minimum
dominating set problem (MDS), the minimum connected dominating set
problem (MCDS), and the minimum total domination problem (MTDS)
are all NP-complete even for circle graphs. He mentions designing
approximation algorithms for these problems as being open. This paper
presents O(1)-approximation algorithms for all three problems (MDS,
MCDS, and MTDS) on circle graphs. For any circle graph with n vertices
and m edges, these algorithms take $O(n^2 + nm)$ time and $O(n^2)$ space.
These results, along with the result on the hardness of approximating
minimum independent dominating set on circle graphs (Damian-Iordache
and Pemmaraju, in these proceedings), advance our understanding of
domination problems on circle graphs significantly.



1 Introduction 

A graph G = (V, E) is called a circle graph if there is a one-to-one correspondence 
between vertices in V and a set C of chords in a circle such that two vertices in V 
are adjacent if and only if the corresponding chords in C intersect. C is called the 
chord intersection model for G. Equivalently, the vertices of a circle graph can be 
placed in one-to-one correspondence with the elements of a set I of intervals such 
that two vertices are adjacent if and only if the corresponding intervals overlap, 
but neither contains the other. I is called the interval model of the corresponding
circle graph. Representations of a circle graph as a graph or as a set of chords or 
as a set of intervals are equivalent via linear time transformations. So, without 
loss of generality, in specifying instances of problems, we assume the availability 
of the representation that is most convenient. 
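The equivalence between the chord model and the interval model can be made concrete with a short sketch. The code below is only an illustration: it assumes that chord endpoints are labelled with distinct integers 1, ..., 2n around the circle and that each chord is given as a pair of labels; the function names `chords_cross` and `circle_graph` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def chords_cross(c, d):
    """Return True if chords c and d intersect inside the circle.

    With endpoints labelled around the circle, two chords cross exactly when
    their intervals overlap and neither interval contains the other.
    """
    a, b = sorted(c)
    x, y = sorted(d)
    return a < x < b < y or x < a < y < b


def circle_graph(chords):
    """Build adjacency sets of the circle graph; vertex i stands for chords[i]."""
    adj = {i: set() for i in range(len(chords))}
    for i, j in combinations(range(len(chords)), 2):
        if chords_cross(chords[i], chords[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj
```

This quadratic construction is meant for small examples only; it makes no attempt to match the linear-time transformations mentioned above.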

For a graph G = (V, E), a subset V' of V is a dominating set of G if for
all $u \in V$ either $u \in V'$ or u has a neighbor in V'. In addition, if no two
vertices in V' are adjacent, then V' is called an independent dominating set; if
the subgraph of G induced by V', denoted G[V'], is connected, then V' is called a
connected dominating set; if G[V'] has no isolated nodes, then V' is called a total
dominating set; and if G[V'] is a clique, then V' is called a dominating clique.
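These definitions translate directly into predicates on a vertex subset. The following is a minimal sketch, assuming the graph is stored as a dictionary mapping each vertex to the set of its neighbors; all function names are illustrative and not part of the paper.

```python
def is_dominating(adj, subset):
    """Every vertex is in the subset or has a neighbor in it."""
    s = set(subset)
    return all(v in s or bool(adj[v] & s) for v in adj)


def is_independent(adj, subset):
    """No two vertices of the subset are adjacent."""
    s = set(subset)
    return all(not (adj[v] & s) for v in s)


def is_total(adj, subset):
    """The induced subgraph has no isolated vertex."""
    s = set(subset)
    return all(bool(adj[v] & s) for v in s)


def is_connected_set(adj, subset):
    """The induced subgraph is connected (sets of size <= 1 count as connected)."""
    s = set(subset)
    if len(s) <= 1:
        return True
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u] & s:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == s


def is_clique(adj, subset):
    """Every pair of distinct vertices of the subset is adjacent."""
    s = set(subset)
    return all(u == v or v in adj[u] for u in s for v in s)
```

A subset is, for example, an independent dominating set when both `is_dominating` and `is_independent` hold for it.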

Garey and Johnson [4] mention that problems of finding a minimum
cardinality dominating set (MDS), minimum cardinality independent dominating
set (MIDS), minimum cardinality connected dominating set (MCDS), minimum
cardinality total dominating set (MTDS), and minimum cardinality dominating
clique (MDC) are all NP-complete for general graphs. Johnson [2] seems to be
the first to identify MDS as an open problem for circle graphs. Little progress
was made towards solving this problem until Elmallah, Stewart and Culberson
defined the class of k-polygon graphs [1]. These are the intersection graphs of
straight-line chords inside a convex k-sided polygon. Permutation graphs form
a proper subset of the class of 3-polygon graphs and the class of circle graphs
is the (infinite) union of the classes of k-polygon graphs for all $k \ge 3$. For fixed
k, Elmallah et al. were able to provide a polynomial time algorithm for MDS.
Finally, Keil [3] resolved the complexity of MDS on circle graphs by showing that
it is NP-complete. In the same paper, he also showed that MCDS and MTDS
are NP-complete for circle graphs. In that paper, Keil does not investigate
the question of approximation algorithms; however, he mentions the construction
of approximation algorithms for MDS, MCDS, and MTDS as being open.

An $\alpha$-approximation algorithm for a minimization problem is a polynomial
time algorithm that guarantees that the ratio of the cost of the solution to the
optimal (over all instances of the problem) does not exceed $\alpha$. In this paper we
present an 8-approximation algorithm for MDS, a 14-approximation algorithm
for MCDS, and a 10-approximation algorithm for MTDS on circle graphs. Our
algorithms use $O(n^2)$ space and run in $O(n^2 + nm)$ time, where n is the number
of vertices and m is the number of edges in the input circle graph. The
algorithms for MCDS and MTDS are obtained by making simple modifications to
the algorithm for MDS.

2 Approximating Dominating Set Problems on Circle 
Graphs 

The minimum dominating set problem on circle graphs (CMDS) is as follows. 

Minimum Dominating Set for Circle Graphs (CMDS) 

INPUT: A set T of chords in a circle.

OUTPUT: A smallest dominating set $U \subseteq T$.

Keil shows that CMDS is NP-complete [3]. In the following we present an
approximation algorithm for CMDS that produces a dominating set of size within
a factor of 8 of optimal.

Let $n = |T|$ and let m be the number of pairwise intersections of chords in
T. In other words, n and m are the number of vertices and the number of edges,
respectively, of the corresponding circle graph. Let $1, 2, \ldots, 2n$ be the sequence of
endpoints of chords in T listed in counterclockwise order, starting at an arbitrary
endpoint. Without loss of generality, we assume that the endpoints of the chords
are all distinct. For any chord $c \in T$, the endpoint with smaller (respectively,
larger) label is called its left endpoint (respectively, right endpoint), denoted $l(c)$
(respectively, $r(c)$). For a set of chords $C \subseteq T$, let $E[C]$ denote the set of all
endpoints of chords in C. For any points $i, j \in E[T]$ let sector(i, j) denote the
arc of the circle obtained by starting from i and moving in counterclockwise
direction until j is reached. We assume that sector(i, j) is an open set, that
is, it does not include its endpoints. Occasionally, we will need sectors that
contain their endpoints; we will use closure(sector(i, j)) to denote a sector
that contains its endpoints i and j. For any sector s, we call the endpoint with
smaller (respectively, larger) label the left endpoint (respectively, right endpoint)
of s, denoted $l(s)$ (respectively, $r(s)$). A set of sectors $s_1, s_2, \ldots, s_m$ is a chain if
$r(s_i) = l(s_{i+1})$ for $i = 1, \ldots, m-1$. Whenever we talk about a set of sectors, it is
understood that the sectors are pairwise disjoint. For any chain C let $l(C)$ denote
the left endpoint of the first sector in C and $r(C)$ denote the right endpoint of
the last sector in C.

Definition: We say that sector(i, j) is a leaf sector if and only if (i) there is
no chord c in T with $i < l(c) < r(c) < j$ and (ii) any chord incident on a point
in sector(i, j) is cut by a chord with both endpoints outside sector(i, j).
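The definition can be checked directly by brute force. The sketch below is an illustration only, not the efficient procedure developed later in the paper: it assumes chords are pairs of distinct labels from 1, ..., 2n, handles only sectors with i < j, and treats the labels i and j themselves as lying outside the open sector. The names `crosses` and `is_leaf_sector` are hypothetical.

```python
def crosses(c, d):
    """True if chords c and d cut each other (endpoints strictly interleave)."""
    a, b = sorted(c)
    x, y = sorted(d)
    return a < x < b < y or x < a < y < b


def is_leaf_sector(chords, i, j):
    """Brute-force test of the leaf sector definition for sector(i, j), i < j."""
    def inside(p):
        return i < p < j  # the sector is open, so i and j are not inside

    for c in chords:
        a, b = c
        if inside(a) and inside(b):
            return False  # condition (i): a chord lies entirely in the sector
        if inside(a) or inside(b):
            # condition (ii): c must be cut by a chord with both endpoints outside
            cut = any(
                not inside(x) and not inside(y) and crosses(c, (x, y))
                for (x, y) in chords
            )
            if not cut:
                return False
    return True
```

Each call costs time proportional to the square of the number of chords, which is far slower than the incremental test of Section 2.2 but convenient for checking small examples by hand.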

Let LS denote the set of all leaf sectors. Note that the definition of a leaf
sector is with respect to the given set of chords T and therefore it may be more
appropriate to denote the set of leaf sectors by LS(T). Since T is fixed, for
convenience we will simply use LS. A chain of leaf sectors C is a leaf sector cover
if sector($r(C)$, $l(C)$) is also a leaf sector. The reason for defining a leaf sector
cover will become clear from the following proposition, which is illustrated in
Figure 1.

Proposition 1. Let $D \subseteq T$ be a dominating set of T of size d. Suppose that
$i_1 < i_2 < \ldots < i_{2d}$ are the points in $E[D]$. Let $s_j$ = sector($i_j$, $i_{j+1}$), for all j,
$1 \le j < 2d$. Then $s_1, s_2, \ldots, s_{2d-1}$ is a leaf sector cover.

Thus all dominating sets of T, including a minimum dominating set, induce a 
leaf sector cover that is roughly twice the size of the dominating set. Our goal is 
to compute an optimal leaf sector cover and then use it to compute a dominating 
set. 



2.1 Computing an Optimal Leaf Sector Cover 

For any pair of points $i, j \in E[T]$, $i < j$, we define OC(i, j) to be a smallest chain
C of leaf sectors with $l(C) = i$ and $r(C) = j$. The following lemma establishes
the optimal substructure property of OC(i, j). Using this property, it is easy to
compute OC(i, j) using dynamic programming, given the set of leaf sectors LS.

Lemma 1. For all $i, j \in E[T]$, $i < j$, OC(i, j) satisfies the following recurrence:

$$
OC(i,j) =
\begin{cases}
\mathrm{sector}(i,j) & \text{if } \mathrm{sector}(i,j) \in LS,\\
\min_{i<k<j} \{\, OC(i,k) \oplus \mathrm{sector}(k,j) \mid \mathrm{sector}(k,j) \in LS \,\} & \text{otherwise.}
\end{cases}
$$

Here the $\oplus$ operator stands for concatenation and the min operator returns the
chain with minimum cardinality.

Fig. 1. $s_1, s_2, \ldots, s_{2d-1}$ is a leaf sector cover

Proof. The proof is by induction on $(j - i)$. The base case is when $j - i = 1$
and the recurrence relation is trivially true. The inductive hypothesis is that
for some natural number s and for all i, j such that $1 \le j - i \le s$, the above
recurrence relation is true. To prove the inductive step we consider the leaf chain
OC(i, j) for some i, j such that $j - i = s + 1$. Let sector(q, j) be the last sector
in OC(i, j). In the recurrence above, the variable k takes on the value q also,
therefore the size of the right hand side is no greater than that of the left hand
side. We now show that the right hand side is no less than the left hand side.
Suppose there existed a k such that

$$|OC(i,j)| > |OC(i,k) \oplus \mathrm{sector}(k,j)|. \qquad (1)$$

But $C = OC(i,k) \oplus \mathrm{sector}(k,j)$ is a chain of leaf sectors with the property that
$l(C) = i$ and $r(C) = j$. By definition, this makes C a candidate for OC(i, j),
therefore $|OC(i,j)| \le |C|$. This contradicts inequality (1) and we are done.

It is obvious that OC(1, 2n) is a leaf sector cover since sector(2n, 1) is a
leaf sector. The following proposition claims that this leaf sector cover is good
enough for our purposes.

Proposition 2. Let $D^*$ be a minimum dominating set of T. If $|D^*| = d^*$, then
$|OC(1, 2n)| \le 2d^* + 2$.

So our goal is to compute OC(1, 2n). As mentioned earlier, given the set of leaf
sectors LS, we can compute OC(1, 2n) using dynamic programming. To compute
OC(1, 2n) using the recurrence of Lemma 1 we need to compute OC(1, j) for
all j, $1 < j \le 2n$. Computing OC(1, j) given OC(1, k) for all k, $1 < k < j$, takes
O(j) time, for a total of $\sum_{j=2}^{2n} O(j) = O(n^2)$ time to compute OC(1, 2n). In the
next subsection we show how to compute the set LS in O(nm) time.
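The dynamic program of Lemma 1, specialised to chains that start at point 1, can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes a constant-time predicate `is_leaf(i, j)` answering whether sector(i, j) is a leaf sector, and it stores predecessor pointers instead of whole chains so that the total work stays quadratic. The name `optimal_chain` is hypothetical.

```python
def optimal_chain(two_n, is_leaf):
    """Return a smallest chain of leaf sectors from point 1 to point two_n.

    The chain is a list of (left, right) endpoint pairs; None is returned if
    the predicate admits no chain at all.
    """
    INF = float("inf")
    size = {1: 0}   # size[j] = number of sectors in the best chain from 1 to j
    prev = {}       # prev[j] = left endpoint of the last sector of that chain
    for j in range(2, two_n + 1):
        size[j], prev[j] = INF, None
        if is_leaf(1, j):
            # first case of the recurrence: a single sector suffices
            size[j], prev[j] = 1, 1
            continue
        for k in range(2, j):
            # second case: extend the best chain ending at k by sector(k, j)
            if is_leaf(k, j) and size[k] + 1 < size[j]:
                size[j], prev[j] = size[k] + 1, k
    if size[two_n] == INF:
        return None
    chain, j = [], two_n
    while j != 1:
        chain.append((prev[j], j))
        j = prev[j]
    chain.reverse()
    return chain
```

For a quick check one can pass a toy predicate such as `lambda i, j: j - i <= 2`, for which a chain from 1 to 6 needs three sectors.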
2.2 Identifying Leaf Sectors

For convenience we assume that the set LS is stored in a $2n \times 2n$ boolean
matrix called leafsector. For any i and j, $1 \le i < j \le 2n$, leafsector[i, j]
is 1 if and only if sector(i, j) is a leaf sector. Since we are only interested in
sector(i, j) for $i < j$, only the strict upper triangular portion of the matrix is
relevant. Furthermore, if a sector sector(i, j) is not a leaf sector, then sectors
sector(i, k) for all k, $j < k \le 2n$, are not leaf sectors and similarly sectors
sector(k, j) for all k, $1 \le k < i$, are not leaf sectors. This implies that if a
certain entry leafsector[i, j] = 0 then all the entries in the submatrix to the
northeast of entry [i, j] are 0. This observation allows us to define an order
for processing the sectors efficiently. We start with the leaf sector sector(1, 2).
Suppose that at a certain stage we have determined whether sector(i, j) is a
leaf sector. The next sector to process can be chosen as follows depending on
whether sector(i, j) is a leaf sector.

- Suppose that sector(i, j) is a leaf sector. We are still looking for the first
sector in row i which is not a leaf sector. So we then process sector(i, j + 1).

- Suppose that sector(i, j) is not a leaf sector. Then we know that for all
k > j, sector(i, k) is not a leaf sector and for all k < j, sector(i, k) is a leaf
sector. We know the latter because if sector(i, k) were not a leaf sector for
some k < j, we would have stopped processing the sectors in row i earlier,
that is, at column k. So we process sector(i + 1, j) next.
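The two rules above describe a staircase walk through the upper triangular matrix. The sketch below shows only that traversal order, under the assumption that some oracle `is_leaf(i, j)` decides a single entry; it records, for each row i, the last column whose entry is 1. The function name `last_leaf_columns` and the oracle are hypothetical, and the incremental O(degree) test described in the text is not reproduced here.

```python
def last_leaf_columns(two_n, is_leaf):
    """For each row i in 1..two_n-1, return the largest j with sector(i, j) a leaf.

    Relies on the monotonicity noted in the text: once sector(i, j) fails to
    be a leaf sector, everything to its northeast fails as well.
    """
    last = {}
    i, j = 1, 2
    while i < two_n:
        if j <= i:
            j = i + 1  # guard for oracles that reject even sector(i, i + 1)
        if j <= two_n and is_leaf(i, j):
            j += 1            # still inside the run of 1's in row i: move right
        else:
            last[i] = j - 1   # first 0 in row i found (or the row is exhausted)
            i += 1            # move down one row and continue from the same column
    return last
```

Every iteration advances either i or j, so the oracle is consulted O(n) times, matching the count of processed sectors stated in the text. Afterwards, sector(i, j) is a leaf sector exactly when `j <= last[i]`.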

For any $i \in E[T]$, let T[i] denote the chord in T incident on i. We will show
that having determined that sector(i, j) is a leaf sector, we can determine if
sector(i, j + 1) is a leaf sector in O(degree(T[j])) time. Similarly, having
determined that sector(i, j) is not a leaf sector, we can determine if sector(i + 1, j)
is a leaf sector in O(degree(T[j - 1])) time. It is easy to see that only O(n) sectors
are considered for processing. From these observations, it follows that the total
work done is O(nm).

In the following we describe how to compute the value of leafsector[i, j + 1]
or leafsector[i + 1, j], having computed the value of leafsector[i, j]. First
some definitions. For a given chord $c \in T$ and a point i, $1 \le i \le 2n$, let
closestpoint(i, c) denote the endpoint of c first encountered in a counterclockwise
walk starting from i. If c = (i, j) is a chord in T, then closestpoint(i, c) = i
and closestpoint(j, c) = j. Define the distance between a point $i \in E[T]$ and a
chord $c \in T$, denoted distance(i, c), as the number of points in E[T] encountered
in the counterclockwise walk starting at i and ending at closestpoint(i, c).
In computing distance(i, c) we include i but not closestpoint(i, c). Thus
distance(i, c) measures the number of "hops" to get to the closer endpoint
of c from i traveling in counterclockwise order. For any two points i and j,
$1 \le i < j \le 2n$, define farthestcut(i, j) as the chord $c \in T$ that cuts T[j] and
maximizes distance(i, c). Some of these definitions are illustrated in Figure 2.
The following proposition is immediate from the definition of a leaf sector.
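The three definitions can be written out as small functions. This is a direct, unoptimised sketch under the assumption that points are labelled 1, ..., 2n in counterclockwise order and that chords are pairs of labels; the snake_case names mirror the paper's closestpoint, distance, and farthestcut but are otherwise illustrative.

```python
def crosses(c, d):
    """True if chords c and d cut each other."""
    a, b = sorted(c)
    x, y = sorted(d)
    return a < x < b < y or x < a < y < b


def closest_point(i, c, two_n):
    """Endpoint of chord c met first on a counterclockwise walk from point i."""
    return min(c, key=lambda p: (p - i) % two_n)


def distance(i, c, two_n):
    """Number of hops from i to closest_point(i, c), counting i but not the target."""
    return (closest_point(i, c, two_n) - i) % two_n


def farthest_cut(i, j, chords, two_n):
    """Chord that cuts T[j] and maximises distance(i, .); None if T[j] is uncut."""
    t_j = next(c for c in chords if j in c)  # T[j]: the chord incident on point j
    cutters = [c for c in chords if crosses(c, t_j)]
    if not cutters:
        return None
    return max(cutters, key=lambda c: distance(i, c, two_n))
```

For the three mutually crossing chords (1, 4), (2, 5), (3, 6), the chord incident on point 2 is (2, 5); it is cut by (1, 4) at distance 0 from point 1 and by (3, 6) at distance 2, so `farthest_cut(1, 2, ...)` returns (3, 6).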

Proposition 3. A sector sector(i, j) is a leaf sector iff for all k, $i < k < j$,
farthestcut(i, k) has both endpoints outside sector(i, j).

Fig. 2. An example illustrating the closestpoint and farthestcut definitions



We use the above proposition to determine if a sector is a leaf sector.
Suppose that we have determined that sector(i, j) is a leaf sector, assigned 1
to leafsector[i, j] and in the process computed farthestcut(i, k) for all k,
$i < k < j$. To determine the status of sector(i, j + 1) we need to do the
following:

1. Check if T[j] = farthestcut(i, k) for any k, $i < k < j$. If this is the case,
then sector(i, j + 1) is not a leaf sector. To check this condition examine each
chord c that cuts T[j] and check if c is incident on k for some k, $i < k < j$.
If so, look up farthestcut(i, k) and check if farthestcut(i, k) = T[j]. This
takes O(degree(T[j])) time.

2. Check if farthestcut(i, j) lies outside sector(i, j). It is easy to see that
farthestcut(i, j) can be computed in O(degree(T[j])) time.

Suppose now that we have determined that sector(i, j) is not a leaf sector,
assigned 0 to leafsector[i, j] and in the process computed farthestcut(i, k)
for all k, $i < k < j$. We now want to determine the status of sector(i + 1, j).
In this case we know that sector(i, j - 1) is a leaf sector. This implies that
sector(i + 1, j - 1) is also a leaf sector. It is easy to see that if sector(i + 1, j - 1)
is a leaf sector, then farthestcut(i + 1, k) = farthestcut(i, k) for all k,
$i + 1 < k < j - 1$. Using the fact that sector(i + 1, j - 1) is a leaf sector and using
information associated with it, we can determine if sector(i + 1, j) is a leaf
sector, just as described above, in time O(degree(T[j - 1])).

This completes our description of how LS is computed. Note that we have
assumed that LS is stored in the leafsector matrix only for convenience.
Ignoring the lower triangular portion of the matrix, we see that each row contains
a contiguous sequence of 1's followed by a contiguous sequence of 0's. Thus we
need to only store the index of the last 1 in each row. This takes O(n) space (as
opposed to $O(n^2)$ space for the matrix) and provides us with an O(1) time test
for determining if sector(i, j) is a leaf sector.

We now show how OC(1, 2n) can be used to compute a small dominating
set. For this we make use of a "structural" property of leaf sectors proved in the
next subsection.



2.3 A Structural Property of Leaf Sectors 

Two sectors $s_i$ and $s_j$ are said to be connected in T if there is a chord $c \in T$ with
one endpoint in $s_i$ and the other endpoint in $s_j$. In this case, c is said to connect
sectors $s_i$ and $s_j$. Let $T_{ij}$ be the set of chords that connect sectors $s_i$ and $s_j$.

Lemma 2. Let $s_i$ and $s_j$ be two leaf sectors which are connected in T. Then $T_{ij}$
can be cut with at most two chords in T.

Proof. Assume without loss of generality that $r(s_i) \le l(s_j)$. By the definition
of a leaf sector, any chord incident on a point in $s_i$ is cut by a chord with both
endpoints outside $s_i$ (and similarly for $s_j$). Assume first that there exists a chord
$c \in T$ with one endpoint in $q_1$ = closure(sector($r(s_i)$, $l(s_j)$)) and the other
endpoint in $q_2$ = closure(sector($r(s_j)$, $l(s_i)$)). Then c cuts all chords in $T_{ij}$ and
the claim is true. So assume that there is no chord in T with one endpoint in $q_1$
and the other endpoint in $q_2$. Let $U \subseteq T$ be the set of chords with one endpoint
in $s_i$ and the other endpoint in $q_1$ or $q_2$. Since $s_j$ is a leaf sector, $U \ne \emptyset$ and U cuts
all chords in $T_{ij}$ (which are incident on points in $s_j$). We greedily pick a chord
c in U that cuts most of the chords in $T_{ij}$. Assume without loss of generality
that c has one endpoint in $q_1$. If c cuts all chords in $T_{ij}$, then the claim is true;
otherwise let $B \subseteq T_{ij}$ be the set of chords in $T_{ij}$ not cut by c. Let t be the chord
in B with the endpoint $l(t)$ closest to $l(c)$. Refer to Figure 3. Consider a chord
$d \in U$ that cuts t. If d has one endpoint in $q_1$, then d cuts all chords cut by c
plus t, contradicting the fact that c cuts most of the chords in $T_{ij}$. Hence d must
have one endpoint in $q_2$. Since t is the chord with the rightmost left endpoint in
B, it follows that d cuts all chords in B. Thus c and d cut all chords in $T_{ij}$.

For any pair of leaf sectors $s_i$ and $s_j$, define cutset($s_i$, $s_j$) as a smallest set of
chords satisfying the following properties: (i) every chord connecting $s_i$ and $s_j$ is
cut by some chord in cutset($s_i$, $s_j$) and (ii) no chord in cutset($s_i$, $s_j$) connects
$s_i$ and $s_j$. The above lemma tells us that cutset($s_i$, $s_j$) is well defined for any
pair of leaf sectors $s_i$ and $s_j$ and has size at most 2. The proof of Lemma 2 yields
an algorithm to compute cutset($s_i$, $s_j$) that takes time $O(\sum_{c \in T_{ij}} \mathrm{degree}(c))$.

2.4 Computing a Dominating Set From a Leaf Sector Cover

We now show how, using the structural property of leaf sectors proved in the
previous section, we can compute a dominating set of T that is within 4 times
the size of OC(1, 2n). Since the size of OC(1, 2n) is roughly within twice the
size of an optimal dominating set, we have an 8-approximation of a minimum
dominating set.

Some definitions before we start describing the algorithm. Given a sequence
of sectors $S = (s_1, s_2, \ldots, s_k)$, the connectivity graph of S, denoted K(S), is the
graph with vertex set S and with $s_i$ adjacent to $s_j$, $1 \le i, j \le k$, if and only if
$i \ne j$ and there is a chord in T that connects $s_i$ and $s_j$. Given a connectivity
graph G, a circle drawing of G is obtained by associating with each vertex in G a
distinct point on the circumference of a circle and by drawing the edges of G as
chords connecting points on the circle that are associated with end vertices of the
edge. Furthermore, we require that in a circle drawing the points corresponding
to sectors are placed in the same order on the circumference of the circle as the
sectors themselves. The chords in the circle drawing of a connectivity graph are
called drawing edges.

Fig. 3. Chords connecting $s_i$ and $s_j$ are cut with two chords c and d in T

In the following we describe the computation of a set $ADS \subseteq T$ of chords that
dominates T. Suppose that initially $ADS = \emptyset$. Let $J \subseteq T$ be the set of chords
incident on the endpoints of sectors in OC(1, 2n). Note that it is not required that
both endpoints of chords in J coincide with endpoints of sectors in OC(1, 2n).
We start by adding J to ADS. Let $C^* = OC(1, 2n) \oplus \mathrm{sector}(2n, 1)$. We use the
connectivity graph of $C^*$, namely $K(C^*)$, and in particular, a circle drawing D of
$K(C^*)$, to identify the chords to add to ADS. Let P be a maximal set of pairwise
non-intersecting chords in D. Remove from P all chords that connect adjacent
sectors (that is, sectors that share an endpoint) in $C^*$. Corresponding to each
drawing edge $d = (s_i, s_j) \in P$, we add to ADS a chord in T connecting $s_i$ and $s_j$.
By the definition of a drawing edge, such a chord always exists. Corresponding to
each drawing edge $d = (s_i, s_j) \in P$, we add cutset($s_i$, $s_j$) to ADS. A summary
of the entire approximation algorithm that computes a dominating set of the
given circle graph is given in Figure 4.

2.5 Analysis of the Algorithm 

The following theorems establish that the above algorithm does produce a
dominating set, that the size of this dominating set is within a factor of 8 of the
optimal, and that the running time of the algorithm is $O(n^2 + nm)$. Let MDS
be a minimum dominating set of the given circle graph.
APPROXIMATION ALGORITHM FOR CMDS
Input: a set of chords T
Output: a set of chords ADS that dominates T

Step 1. compute the set LS of leaf sectors
Step 2. compute the chain of leaf sectors OC(1, 2n)
Step 3. assign to ADS the set of chords incident on sectors in OC(1, 2n)
        let C* be OC(1, 2n) ⊕ sector(2n, 1)
Step 4. find a maximal set of non-intersecting chords P in the circle drawing D of K(C*)
        remove from P drawing edges that connect sectors adjacent in C*
Step 5. for each drawing edge d = (s_i, s_j) ∈ P do
            pick an arbitrary chord connecting s_i and s_j and add it to ADS
            add cutset(s_i, s_j) to ADS
        return ADS

Fig. 4. An outline of the approximation algorithm for CMDS
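Step 4 of Fig. 4 can be illustrated with a simple greedy pass. The sketch below is an assumption-laden illustration, not the linear-time procedure implied by the running-time analysis: it assumes the q sectors of C* are numbered 0, ..., q-1 in circular order and that each drawing edge is a pair of sector indices. The names `edges_cross`, `maximal_noncrossing`, and `drop_adjacent` are hypothetical.

```python
def edges_cross(e, f):
    """True if two drawing edges cross; edges sharing an endpoint do not cross."""
    a, b = sorted(e)
    x, y = sorted(f)
    return a < x < b < y or x < a < y < b


def maximal_noncrossing(edges):
    """Greedily pick a maximal set of pairwise non-crossing drawing edges."""
    chosen = []
    for e in edges:
        if all(not edges_cross(e, f) for f in chosen):
            chosen.append(e)
    return chosen


def drop_adjacent(edges, q):
    """Remove drawing edges joining sectors that are consecutive around the circle."""
    return [(a, b) for a, b in edges if (a - b) % q not in (1, q - 1)]
```

Because every rejected edge crosses some chosen edge, the greedy result is maximal (though not necessarily maximum), which is all that the argument in the text requires. The greedy pass is quadratic in the number of drawing edges.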



Theorem 1. ADS is a dominating set of T.

Proof. Let c be an arbitrary chord in $T - ADS$. We show that c is cut by one of
the chords in ADS. Recall that all chords incident on endpoints of sectors in $C^*$
are included in ADS. Therefore, there are sectors $s_i$ and $s_j$ in $C^*$ such that c
connects $s_i$ and $s_j$. Then there is a drawing edge $d = (s_i, s_j)$ in $K(C^*)$. We now
consider two cases, depending on whether d belongs to P or not. Suppose first
that d belongs to P. In this case, corresponding to d, we have added to ADS
the set of chords cutset($s_i$, $s_j$) that cut all chords connecting $s_i$ and $s_j$. Assume
now that d does not belong to P. In this case d cuts at least one of the drawing
edges in P. Let $f = (s_k, s_l)$ be a drawing edge in P that cuts d. Corresponding
to f, we have included in ADS a chord $g \in T$ with one endpoint in $s_k$ and the
other endpoint in $s_l$. It is easy to see that g cuts all chords connecting $s_i$ and
$s_j$. Since the choice of c was arbitrary we conclude that ADS is a dominating
set of T.



Theorem 2. $|ADS| \le 8|MDS| - 1$.

Proof. Let $q = |C^*|$. Proposition 2 implies that

$$q \le 2|MDS| + 2. \qquad (2)$$

ADS contains chords in J, plus a chord corresponding to each chord in P, plus
at most 2 chords in cutset($s_i$, $s_j$) for each chord $(s_i, s_j) \in P$. Thus

$$|ADS| \le |J| + 3|P|. \qquad (3)$$

Clearly,

$$|J| \le q. \qquad (4)$$

P is an outerplanar embedding of a graph with q vertices and such a graph has
at most $2q - 3$ edges. However, by removing edges between adjacent sectors, we
leave at most $q - 3$ edges in P. Hence

$$|P| \le q - 3. \qquad (5)$$

Substituting inequalities (4) and (5) in (3) we get

$$|ADS| \le q + 3(q - 3) = 4q - 9.$$

Substituting inequality (2) we get $|ADS| \le 8|MDS| - 1$.



Theorem 3. The running time of the algorithm is $O(n^2 + nm)$.

Proof. Steps 1 and 2 of the algorithm that are needed to compute OC(1, 2n)
take $O(n^2 + nm)$ time. Step 3 takes O(n) time. Steps 4 and 5 take O(n + m)
time.



2.6 Approximating MTDS 

A chord $c \in T$ is said to be isolated with respect to a set $U \subseteq T$ if and only if c
does not intersect any of the chords in $U - \{c\}$. We assume that T contains no
isolated chords with respect to T, otherwise MTDS does not have a solution.

A simple algorithm that computes a total dominating set ATDS of T is as
follows. Initially, ATDS is empty. Start by adding ADS to ATDS. Then, for each
chord $c \in ADS$ which is isolated with respect to ADS, pick an arbitrary chord
$d \in T$ that cuts c and add it to ATDS. Clearly, ATDS is a total dominating set
of T.
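The augmentation step just described is short enough to sketch in full. The code assumes chords are pairs of endpoint labels and that a dominating set is given as a collection of indices into the chord list; the function name `total_dominating_from` is hypothetical, and the sketch uses a plain scan rather than any particular data structure.

```python
def crosses(c, d):
    """True if chords c and d cut each other."""
    a, b = sorted(c)
    x, y = sorted(d)
    return a < x < b < y or x < a < y < b


def total_dominating_from(chords, ads):
    """Extend a dominating set (indices into chords) to a total dominating set.

    For every chord of ads that no other chord of ads cuts, one arbitrary
    cutting chord of T is added. Raises ValueError if some chord is isolated
    in T, in which case no total dominating set exists.
    """
    ads = set(ads)
    atds = set(ads)
    for c in sorted(ads):
        isolated = not any(crosses(chords[c], chords[d]) for d in ads if d != c)
        if isolated:
            cutter = next(
                (d for d in range(len(chords)) if crosses(chords[c], chords[d])),
                None,
            )
            if cutter is None:
                raise ValueError("chord %r is isolated in T" % (chords[c],))
            atds.add(cutter)
    return atds
```

At most one chord is added per member of ADS, so the result has at most twice the size of ADS; the sharper bound in the text comes from observing that only chords in J can be isolated.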

Lemma 3. For any chord $c \in ADS - J$, there is a chord in $ADS - \{c\}$ that
cuts c.

Proof. ADS contains chords of three types:

Type (1) those that belong to J,

Type (2) those that correspond to edges in P, and

Type (3) those that belong to cutset($s_i$, $s_j$) for some $(s_i, s_j) \in P$.

Let c be an arbitrary chord in ADS. If c is of Type (2), that is, there is a chord
$d = (s_i, s_j) \in P$ and c is a chord connecting sectors $s_i$ and $s_j$, then c is cut
by an element in cutset($s_i$, $s_j$) $\subseteq ADS$. Similarly, if c is of Type (3), that is,
$c \in$ cutset($s_i$, $s_j$) for some drawing edge $d = (s_i, s_j) \in P$, then the chord we
choose to add to ADS corresponding to d will cut c.



Theorem 4. $|ATDS| \le 10|MDS| + 1$.

Proof. If $c \in ADS - J$, then Lemma 3 tells us that c is not isolated with respect
to ADS. If $c \in J$, then c may be isolated with respect to ADS, and in this case
we have added to ATDS a chord in T that cuts c. Thus we have

$$|ATDS| \le |ADS| + |J|. \qquad (6)$$

Substituting inequalities (4) and (2) in (6), we get $|ATDS| \le |ADS| + 2|MDS| + 2$.
Using the result of Theorem 2 we get $|ATDS| \le 10|MDS| + 1$.

The result of the following theorem is immediate from Theorem 3.

Theorem 5. The running time of the algorithm is $O(n^2 + nm)$.



2.7 Approximating MCDS 

We assume that T is connected, otherwise MCDS does not have a solution. To
compute a connected dominating set of T, we start with the dominating set
ADS and add to it a small set of chords that connects the elements of ADS.

One straightforward approach to connect the elements of ADS using a
smallest set of chords is to find a minimum Steiner tree for ADS in T. Then the
set of chords in T corresponding to vertices of the minimum Steiner tree is a
connected dominating set of T. Let MCDS be a minimum connected
dominating set of T. It is clear that $MCDS \cup ADS$ contains a Steiner tree for ADS and
therefore the number of vertices in a minimum Steiner tree for ADS is no more
than $|ADS| + |MCDS|$. This along with the result of Theorem 2 imply that the
size of the connected dominating set found using this approach is within a factor
of 9 of optimal. Johnson [2] mentions that the minimum Steiner tree problem
has a polynomial time solution for circle graphs. However, this result does not
appear explicitly in the literature.

In the following we present a second approach to connect the elements of
ADS, which gives us a connected dominating set of size within a factor of 14
of optimal. For a given set of chords S, let $CC(S)$ denote the set of connected
components in S.

Lemma 4. $|CC(ADS)| \le 5|MDS| + 0.5$

Proof. An earlier inequality tells us that the number of chords in $ADS - J$ is at
most $3|P|$. This along with Lemma 3 implies that the number of connected
components in $ADS - J$ is at most $1.5|P|$, since every such component contains at
least two chords. Hence the total number of connected components in ADS is
$|CC(ADS)| \le |J| + 1.5|P|$. Substituting the earlier inequalities we get
$|CC(ADS)| \le 2.5g - 4.5$. Substituting once more we get
$|CC(ADS)| \le 5|MDS| + 0.5$.

For any connected component $CC_i$, define the neighborhood of $CC_i$, denoted
$nbd(CC_i)$, as the set of chords in T adjacent to at least one chord in $CC_i$. For
any pair of connected components $CC_i$ and $CC_j$, define $connect(CC_i, CC_j)$ as
the smallest set of chords $U \subseteq T$ such that $CC_i \cup CC_j \cup U$ is connected.

Lemma 5. Let D be a dominating set of T. For any connected component
$CC_i \in CC(D)$, there is a connected component $CC_j \in CC(D)$, $j \ne i$, such
that $|connect(CC_i, CC_j)| \le 2$.

Proof. Let i be fixed. Since we are assuming that the input graph is connected,
there exists a chord $c \in nbd(CC_i)$ adjacent to some chord
$d \in T - CC_i - nbd(CC_i)$. If $d \in CC_j$ for some $j \ne i$, then
$CC_i \cup CC_j \cup \{c\}$ is connected and the lemma is true. Otherwise let $e \in D$
be a chord that cuts d (such a chord always exists since D is a dominating set)
and let $CC_j$ be the connected component in $CC(D)$ that contains e. Since
$c \in nbd(CC_i)$, $d \in nbd(CC_j)$ and c is adjacent to d, we have that
$CC_i \cup CC_j \cup \{c, d\}$ is connected.

We now use the above Lemma to compute a connected dominating set ACDS 
of T as follows. 

Step 1. initialize ACDS to ADS and compute CC(ACDS)
        while there exist CC_i and CC_j in CC(ACDS)
              with |connect(CC_i, CC_j)| = 1 do
            add connect(CC_i, CC_j) to ACDS

Step 2. while |CC(ACDS)| > 1 do
            pick an arbitrary connected component CC_i in CC(ACDS)
            find CC_j in CC(ACDS) such that |connect(CC_i, CC_j)| = 2
            add connect(CC_i, CC_j) to ACDS

Clearly ACDS is a connected dominating set. 
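
A minimal sketch of Steps 1 and 2 follows. It is not the authors' implementation: the adjacency-dict input `adj`, the helper names `components` and `find_link`, and the brute-force searches are assumptions made for illustration, and the sketch ignores the $O(n^2 + nm)$ bookkeeping analysed in Theorem 7. In Step 2 it accepts a link of one chord if one happens to exist, otherwise a link of two chords as guaranteed by Lemma 5.

```python
def components(adj, chosen):
    """Connected components of the subgraph induced by `chosen`."""
    seen, comps = set(), []
    for s in sorted(chosen):
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y in chosen and y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps


def find_link(adj, acds, comp, max_size):
    """Return at most `max_size` chords joining `comp` to another component."""
    rest = acds - comp
    outside = [c for c in sorted(adj) if c not in acds and adj[c] & comp]
    for c in outside:
        if adj[c] & rest:
            return {c}  # a single chord touches both components
    if max_size >= 2:
        for c in outside:
            for d in sorted(adj[c]):
                if d not in acds and adj[d] & rest:
                    return {c, d}  # the two-chord link of Lemma 5
    return None


def connected_dominating_set(adj, ads):
    acds = set(ads)
    progress = True
    while progress:  # Step 1: merge components that one chord can join
        progress = False
        for comp in components(adj, acds):
            link = find_link(adj, acds, comp, 1)
            if link:
                acds |= link
                progress = True
                break
    while len(comps := components(adj, acds)) > 1:  # Step 2
        link = find_link(adj, acds, comps[0], 2)
        if link is None:
            raise ValueError("graph is disconnected or ads is not dominating")
        acds |= link
    return acds


path6 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
acds_two_step = connected_dominating_set(path6, {1, 4})
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
acds_one_step = connected_dominating_set(path5, {1, 3})
```

On the six-vertex path no single chord joins the two components, so Step 2 adds two chords; on the five-vertex path Step 1 already suffices.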

Theorem 6. $|ACDS| \le 14|MCDS| - 3.5$.

Proof. Let $CC_1, CC_2, \ldots, CC_p$ be the connected components in $CC(ACDS)$ after
Step 1 above. At this point in the algorithm, for each pair of connected
components $CC_i$ and $CC_j$, $1 \le i \ne j \le p$, we have that
$nbd(CC_i) \cap nbd(CC_j) = \emptyset$. Thus the sets of chords $CC_i \cup nbd(CC_i)$
for all i, $1 \le i \le p$, are pairwise disjoint. This implies that MCDS contains
at least one chord in each of the sets $CC_i \cup nbd(CC_i)$, $i = 1, \ldots, p$. If,
for some i, MCDS does not contain a chord in $CC_i \cup nbd(CC_i)$, then MCDS does
not cut any of the chords in $CC_i \cup nbd(CC_i)$, contradicting the fact that MCDS
is a dominating set. Hence we have that $p \le |MCDS|$ and therefore at least
$|CC(ADS)| - |MCDS| - 1$ pairs of connected components have been processed in
Step 1 of the algorithm. Thus the total number of chords introduced in Steps 1
and 2 above is $|ACDS - ADS| \le (|CC(ADS)| - |MCDS| - 1) + 2(|MCDS| - 1)$. Using
the result of Lemma 4 we get $|ACDS - ADS| \le 5|MDS| + |MCDS| - 2.5$. It is
obvious that $|MDS| \le |MCDS|$. Using this and the result of Theorem 2 we get
$|ACDS| \le 14|MCDS| - 3.5$.
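
The arithmetic behind the last step can be laid out as a chain of inequalities. This is a sketch that assumes Theorem 2 supplies the bound $|ADS| \le 8|MDS| - 1$, which is the value consistent with the proof of Theorem 4 but is not restated in this section.

```latex
\begin{align*}
|ACDS| &= |ADS| + |ACDS - ADS| \\
       &\le (8|MDS| - 1) + (5|MDS| + |MCDS| - 2.5) \\
       &= 13|MDS| + |MCDS| - 3.5 \\
       &\le 14|MCDS| - 3.5 \qquad \text{since } |MDS| \le |MCDS|.
\end{align*}
```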



Theorem 7. The running time of the algorithm is $O(n^2 + nm)$.

Proof. We assume that the algorithm dynamically maintains an $n \times n$ boolean
matrix called nbd, whose rows correspond to chords in $T - ACDS$ and columns
correspond to connected components in $CC(ACDS)$. For each chord
$c \in T - ACDS$ and each connected component $K \in CC(ACDS)$, $nbd[c, K] = 1$ if and
only if $c \in nbd(K)$. For each chord c, the last entry in row c stores the number
of connected components in $CC(ACDS)$ whose neighborhoods contain c (i.e., the
number of entries in row c equal to 1). A zero value for this entry indicates that
the corresponding chord belongs to a connected component in $CC(ACDS)$. This
allows us to perform the test for the existence of two connected components $CC_i$
and $CC_j$ in $CC(ACDS)$ such that $|connect(CC_i, CC_j)| = 1$, in Step 1 of the
algorithm, in time $O(n)$. The computation of $CC(ACDS)$ in Step 1 takes time
$O(n + m)$ (using a depth-first search technique, for instance). The initialization of
nbd takes time $O(\sum_{c \in T - ADS} degree(c)) = O(m)$. It is also not hard to see that
the matrix nbd can be updated in time
$O(\sum_{c \in nbd(CC_i) \cup nbd(CC_j)} degree(c)) = O(m)$, each time
$connect(CC_i, CC_j)$ is added to ACDS. The search for $CC_j$
and the computation of $connect(CC_i, CC_j)$ in Step 2 of the algorithm take
time $O(\sum_{c \in nbd(CC_i)} degree(c)) = O(m)$. Since the total number of connected
components in $CC(ADS)$ we start with is $O(n)$, Steps 1 and 2 above take time
$O(nm)$. This along with Theorem 3 implies that the running time of the algorithm
is $O(n^2 + mn)$.

3 Conclusions 

The topic of approximation algorithms for dominating set problems on restricted
classes of graphs has not received much attention from researchers as yet. Our
results in this paper (MDS, MTDS and MCDS can be approximated within
a constant factor of optimal in polynomial time), along with the result on
the hardness of approximating minimum independent dominating set on circle
graphs (Damian-Iordache and Pemmaraju, in these proceedings), are probably the
first of this kind. Improving on the factors of approximation of the algorithms
presented in this paper remains an interesting open problem.

Abstract. We propose to make use of ordered binary decision diagrams
(OBDDs) as a means of realizing knowledge-bases. We show that the 
OBDD-based representation is more efficient and suitable in some cases, 
compared with the traditional CNF-based and/or model-based represen- 
tations in the sense of space requirement. We then consider two recogni- 
tion problems of OBDDs, and present polynomial time algorithms for 
testing whether a given OBDD represents a unate Boolean function, and 
whether it represents a Horn function. 



1 Introduction 

Logical formulae are one of the traditional means of representing knowledge in 
AI [T^. However, it is known that deduction from a set of propositional clauses 
is co-NP-complete and abduction is NP-complete m- Recently, an alternative 
way of representing knowledge, i.e., by a subset of its models, which are called 
characteristic models, has been proposed (see e.g., iZlBIM)- Deduction from a 
knowledge-base in this model-based approach can be performed in linear time, 
and abduction is also performed in polynomial time [7]. 

In this paper, we propose yet another method of knowledge representation,
i.e., the use of ordered binary decision diagrams (OBDDs) [1, 2, 3]. An OBDD is a
directed acyclic graph representing a Boolean function, and can be considered as 
a variant of decision trees. By restricting the order of variable appearances and 
by sharing isomorphic subgraphs, OBDDs have the following useful properties: 
1) When a variable ordering is given, an OBDD has a reduced canonical form 
for each Boolean function. 2) Many Boolean functions appearing in practice can 
be compactly represented. 3) There are efficient algorithms for many Boolean 
operations on OBDDs. 4) When an OBDD is given, satisfiability and tautology 
of the corresponding function can be easily checked in constant time. As a result 
of these properties, OBDDs are widely used for various applications, especially 
in computer-aided design and verification of digital systems (see e.g., 121111 ). 
The manipulation of knowledge-bases by OBDDs (e.g. deduction and abduction) 
was first discussed by Madre and Coudert HD. 

We first compare the above three representations, i.e., formula-based,
model-based, and OBDD-based, on the basis of their sizes. This will give a foundation
for analyzing and comparing time and space complexities of various operations.
Comparisons between these representations have been attempted in different
communities. In the AI community, it was shown that formula-based and
model-based representations are incomparable with respect to space requirement [7].
Namely, each of them sometimes allows exponentially smaller sizes than the 
other, depending on the functions. In theoretical science and VLSI design com- 
munities, formula-based and OBDD-based representations are shown to be in- 
comparable [H|. However, the three representations have never been compared 
on the same ground. We show that, in some cases, OBDD-based representation 
requires exponentially smaller space than the other two, while there are also 
cases in which each of the other two requires exponentially smaller space than 
that of OBDDs. We also point out an unfortunate result that there exists a Horn 
function which requires exponential size for any of the three representations. 

OBDDs are known to be efficient for knowledge-base operations such as de- 
duction and abduction m- We investigate two fundamental recognition pro- 
blems of OBDDs, that is, testing whether a given OBDD represents a unate 
Boolean function, and testing whether it represents a Horn function. We often 
encounter these recognition problems, since a knowledge-base representing some 
real phenomenon is sometimes required to be unate or Horn, from the hypothesis 
posed on the phenomenon and/or from the investigation of the mechanism cau- 
sing the phenomenon. For example, if the knowledge-base represents the data set 
of test results on various physical measurements (e.g., body temperature, blood 
pressure, number of pulses and so on), it is often the case that the diagnosis of 
a certain disease is monotonically depending on each test result (we allow chan- 
ging the polarities of variables if necessary). Also in artificial intelligence, it is 
common to consider Horn knowledge-bases as they can be processed efficiently 
in many respects (for example, deduction from a set of Horn clauses can be done 
in linear time 0 )- We show that these recognition problems for OBDDs can be 
solved in polynomial time for both the unate and Horn cases. 

The rest of this paper is organized as follows. The next section gives fun- 
damental definitions and concepts. We compare the three representations in 
Section 3, and consider the problems of recognizing unate and Horn OBDDs in 
Sections 4 and 5, respectively. 

2 Preliminaries 

2.1 Notations and Fundamental Concepts

We consider a Boolean function $f : \{0,1\}^n \to \{0,1\}$. An assignment is a vector
$a \in \{0,1\}^n$, whose $i$-th coordinate is denoted by $a_i$. A model of $f$ is a satisfying
assignment of $f$, and the theory $\Sigma(f)$ representing $f$ is the set of all models of $f$.
Given $a, b \in \{0,1\}^n$, we denote by $a \le b$ the usual bitwise (i.e., componentwise)
ordering of assignments; $a_i \le b_i$ for all $i = 1, 2, \ldots, n$, where $0 < 1$. Given a
subset $E \subseteq \{1, 2, \ldots, n\}$, $\chi^E$ denotes the characteristic vector of $E$; the $i$-th
coordinate $\chi^E_i$ equals 1 if $i \in E$ and 0 if $i \notin E$.

Let $x_1, x_2, \ldots, x_n$ be the n variables of $f$. Negation of a variable $x_i$ is
denoted by $\bar{x}_i$. Any Boolean function can be represented by some CNF
(conjunctive normal form), which may not be unique. We sometimes do not make
a distinction among a function $f$, its theory $\Sigma(f)$, and a CNF $\varphi$ that
represents $f$, unless confusion arises. We define a restriction of $f$ by replacing
a variable $x_i$ by a constant $a_i \in \{0,1\}$, and denote it by $f|_{x_i=a_i}$. Namely,
$f|_{x_i=a_i}(x_1, \ldots, x_n) = f(x_1, \ldots, x_{i-1}, a_i, x_{i+1}, \ldots, x_n)$ holds. Restriction may be
applied to many variables. We also define $f \le g$ (resp., $f < g$) by $\Sigma(f) \subseteq \Sigma(g)$
(resp., $\Sigma(f) \subset \Sigma(g)$).

For an assignment $p \in \{0,1\}^n$, we define $a \le_p b$ if $(a \oplus_{bit} p) \le (b \oplus_{bit} p)$ holds,
where $\oplus_{bit}$ denotes the bitwise exclusive-or operation. A Boolean function $f$ is
unate with polarity $p$ if $f(a) \le f(b)$ holds for all assignments $a$ and $b$ such that
$a \le_p b$. A theory $\Sigma$ is unate if $\Sigma$ represents a unate function. A clause is unate
with polarity $p$ if $p_i = 0$ for all positive literals $x_i$ and $p_i = 1$ for all negative
literals $\bar{x}_i$ in the clause. A CNF is unate with polarity $p$ if it contains only unate
clauses with polarity $p$. It is known that a theory $\Sigma$ is unate if and only if $\Sigma$ can
be represented by some unate CNF. A unate function is positive (resp., negative)
if its polarity is $(00 \cdots 0)$ (resp., $(11 \cdots 1)$).
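
As a small illustration of this definition, the following brute-force sketch tests unateness on a truth table, one variable at a time. It assumes the function is supplied as a Python callable on bit tuples (an invented interface, exponential in n, and unrelated to the OBDD algorithm of Section 4); a variable on which the function does not depend is reported with polarity 0.

```python
from itertools import product


def polarity_if_unate(f, n):
    """Return a polarity tuple (p_1, ..., p_n) if f is unate, else None."""
    polarity = []
    for i in range(n):
        nondecreasing = nonincreasing = True
        for a in product((0, 1), repeat=n):
            if a[i] == 1:
                continue
            b = a[:i] + (1,) + a[i + 1:]  # same assignment with bit i raised
            if f(a) > f(b):
                nondecreasing = False
            if f(a) < f(b):
                nonincreasing = False
        if nondecreasing:
            polarity.append(0)
        elif nonincreasing:
            polarity.append(1)
        else:
            return None
    return tuple(polarity)


# x1 AND (NOT x2) is unate with polarity (0, 1); XOR is not unate.
mixed = polarity_if_unate(lambda a: a[0] & (1 - a[1]), 2)
xor = polarity_if_unate(lambda a: a[0] ^ a[1], 2)
```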

A theory $\Sigma$ is Horn if $\Sigma$ is closed under operation $\wedge_{bit}$, where $a \wedge_{bit} b$ is
bitwise AND of models $a$ and $b$. For example, if $a = (0011)$ and $b = (0101)$,
then $a \wedge_{bit} b = (0001)$. The closure of a theory $\Sigma$ with respect to $\wedge_{bit}$ is denoted
by $Cl_{\wedge_{bit}}(\Sigma)$. We also use the operation $\wedge_{bit}$ as a set operation;
$\Sigma(f) \wedge_{bit} \Sigma(g) = \{a \mid a = b \wedge_{bit} c$ holds for some $b \in \Sigma(f)$ and $c \in \Sigma(g)\}$.
We often denote $\Sigma(f) \wedge_{bit} \Sigma(g)$ by $f \wedge_{bit} g$, for convenience. Note that the two
functions $f \wedge g$ and $f \wedge_{bit} g$ are different.

A Boolean function $f$ is Horn if $\Sigma(f)$ is Horn; equivalently if $f \wedge_{bit} f = f$
holds (as sets of models). A clause is Horn if the number of positive literals in it
is at most one, and a CNF is Horn if it contains only Horn clauses. It is known
that a theory $\Sigma$ is Horn if and only if $\Sigma$ can be represented by some Horn CNF.
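
The closure condition can be checked directly on an explicit set of models. The sketch below is illustrative only; it assumes models are given as tuples of bits, and the function names are not from the paper.

```python
def bit_and(a, b):
    """Bitwise AND of two models given as equal-length bit tuples."""
    return tuple(x & y for x, y in zip(a, b))


def is_horn_theory(models):
    """A theory is Horn iff it is closed under bitwise AND."""
    models = set(models)
    return all(bit_and(a, b) in models for a in models for b in models)


a, b = (0, 0, 1, 1), (0, 1, 0, 1)
meet = bit_and(a, b)  # the example from the text
```

The pair $\{a, b\}$ alone is not a Horn theory, but adding their bitwise AND makes it one.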

For any Horn theory $\Sigma$, a model $a \in \Sigma$ is called characteristic if it cannot
be produced by bitwise AND of other models in $\Sigma$; $a \notin Cl_{\wedge_{bit}}(\Sigma - \{a\})$. The set
of all characteristic models of a Horn theory $\Sigma$, which we call the characteristic
set of $\Sigma$, is denoted by $Char(\Sigma)$. Note that every Horn theory $\Sigma$ has a unique
characteristic set $Char(\Sigma)$, which satisfies $Cl_{\wedge_{bit}}(Char(\Sigma)) = \Sigma$.
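
For a small explicit theory, the characteristic set can be computed straight from this definition. The following is a naive sketch (the names and the fixed-point closure loop are assumptions for illustration, not an efficient procedure).

```python
def bit_and(a, b):
    return tuple(x & y for x, y in zip(a, b))


def closure(models):
    """Smallest superset of `models` closed under bitwise AND."""
    closed = set(models)
    while True:
        new = {bit_and(a, b) for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new


def characteristic_set(theory):
    """Models that are not in the AND-closure of the remaining models."""
    theory = set(theory)
    return {a for a in theory if a not in closure(theory - {a})}


theory = {(1, 1, 0), (0, 1, 1), (0, 1, 0)}
char = characteristic_set(theory)
```

Here $(010)$ is the bitwise AND of the other two models, so it is dropped, and closing the two remaining models recovers the whole theory.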

2.2 Ordered Binary Decision Diagrams 

An ordered binary decision diagram (OBDD) is a directed acyclic graph that
represents a Boolean function. It has two sink nodes 0 and 1, called the 0-node
and the 1-node, respectively (which are together called the constant nodes).
Other nodes are called variable nodes, and each variable node $v$ is labeled by
one of the variables $x_1, x_2, \ldots, x_n$. Let $var(v)$ denote the label of node $v$. Each
variable node has exactly two outgoing edges, called a 0-edge and a 1-edge,
respectively. One of the variable nodes becomes the unique source node, which
is called the root node. Let $X = \{x_1, x_2, \ldots, x_n\}$ denote the set of n variables.
A variable ordering is a total ordering $(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)})$, associated with
each OBDD, where $\pi$ is a permutation $\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$. The level
of a node $v$, denoted by $level(v)$, is defined by its label; if node $v$ has label $x_{\pi(i)}$,
$level(v)$ is defined to be $n - i + 1$. (This definition of level may be different from
its common use.) The level of the constant nodes is defined to
be 0. On every path from the root node to a constant node in an OBDD, each
variable appears at most once in the decreasing order of their levels.

Every node $v$ of an OBDD also represents a Boolean function $f_v$, defined by
the subgraph consisting of those edges and nodes reachable from $v$. If node $v$ is a
constant node, $f_v$ equals its label. If node $v$ is a variable node, $f_v$ is defined as
$\overline{var(v)}\, f_{0\text{-}succ(v)} \vee var(v)\, f_{1\text{-}succ(v)}$ by Shannon's expansion, where $0\text{-}succ(v)$ and
$1\text{-}succ(v)$, respectively, denote the nodes pointed to by the 0-edge and the 1-edge
of node $v$. The function $f$ represented by an OBDD is the one represented by
the root node. Given an assignment $a \in \{0,1\}^n$, the value of $f(a)$ is determined
by following the corresponding path from the root node to a constant node in
the following manner: at a variable node $v$, one of the outgoing edges is selected
according to the assignment $a_{var(v)}$ to the variable $var(v)$. The value of the
function is the label of the final constant node.
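
The evaluation rule just described amounts to a simple walk from the root. The sketch below assumes a hypothetical encoding in which an OBDD is a dictionary from node names (strings) to triples `(variable index, 0-successor, 1-successor)` and the constant nodes are the integers 0 and 1; this encoding is chosen for the example and is not prescribed by the text.

```python
def evaluate(nodes, root, assignment):
    """Follow the path selected by `assignment` (dict: variable index -> bit)."""
    v = root
    while v not in (0, 1):  # constant nodes are the integers 0 and 1
        var, zero_succ, one_succ = nodes[v]
        v = one_succ if assignment[var] else zero_succ
    return v


# OBDD for f = x2 AND x1 with variable ordering (x2, x1).
and_obdd = {"r": (2, 0, "s"), "s": (1, 0, 1)}
values = {(a1, a2): evaluate(and_obdd, "r", {1: a1, 2: a2})
          for a1 in (0, 1) for a2 in (0, 1)}
```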

When two nodes u and v in an OBDD represent the same function, and 
their levels are the same, they are called equivalent. A node whose 0-edge and 
1-edge both point to the same node is called redundant. An OBDD which has 
no equivalent nodes and no redundant nodes is reduced. The size of an OBDD is 
the number of nodes in the OBDD. Given a function / and a variable ordering, 
its reduced OBDD is unique and has the minimum size among all OBDDs with 
the same variable ordering. The minimum sizes of OBDDs representing a given 
Boolean function depends on the variable orderings [^. In the following, we 
assume that all OBDDs are reduced. 

3 Three Approaches for Knowledge-Base Representation 

In this section, we compare three knowledge-base representations: CNF-based,
model-based, and OBDD-based. We show that OBDD-based representation is 
incomparable to the other two with respect to space requirement. 

Lemma 3.1. There exists a negative theory on n variables, for which OBDD 
and CNF both require size 0{n), while its characteristic set requires size 

Proof. Consider a function $f_A = \bigwedge_{i=1}^{m} (\bar{x}_{2i-1} \vee \bar{x}_{2i})$, where $n = 2m$. □

Lemma 3.2. There exists a negative theory on n variables, for which OBDD 
requires size 0{n) and the characteristic set requires size 0{n^), while CNF 
requires size fl(f2^!'^'). 

Proof. Consider a function $f_B = \bigvee_{i=1}^{m} (\bar{x}_{2i-1} \wedge \bar{x}_{2i})$, where $n = 2m$. □

Theorem 3.1. There exists a negative theory on n variables, for which OBDD
requires size $O(n)$, while both of the characteristic set and CNF require sizes
$\Omega(2^{n/4})$.

Proof. Consider a function
$f_C = (\bigwedge_{i=1}^{m} (\bar{x}_{2i-1} \vee \bar{x}_{2i})) \wedge (\bigvee_{i=m+1}^{2m} (\bar{x}_{2i-1} \wedge \bar{x}_{2i}))$,
where $n = 4m$. This theorem can be obtained from Lemmas 3.1 and 3.2. □

Theorem 3.2. There exists a Horn theory on n variables, for which both of the
CNF and the characteristic set require sizes 0(n), while the size of the smallest 
OBDD representation is 

Proof Consider a function fo = (AS^(a;i,m+i V V^i ^ij)) A V 

V™ 1 Xij)) on n variables Xij, 1 < i,j < m -I- 1, where n = (m -I- 1)^. □ 

The above results show that none of the three representations always dominates
the other two. Therefore, OBDDs can find a place in knowledge-bases as
they can represent some theories more efficiently than others.

Unfortunately, by combining Theorems 3.1 and 3.2 we can construct the
following function, which is exponential for all representations.

Corollary 3.1. There exists a Horn function on n variables, for which both of 
the characteristic set and CNF require sizes 12(2"/®) and the size of the smallest 
OBDD representation is 17(2^/^). 

Proof. Consider the conjunction $f_C \wedge f_D$, where $f_C$ (resp., $f_D$) is defined in the
proof of Theorem 3.1 (resp., Theorem 3.2). Note that $f_C$ and $f_D$ both have $n/2$
variables, but share none of the variables. □

4 Checking Unateness of OBDD 

In this section, we discuss the problem of checking whether a given OBDD
is unate. We assume, without loss of generality, that the variable ordering is
always $(x_n, x_{n-1}, \ldots, x_1)$. The following well-known property indicates that this
problem can be solved in polynomial time.

Property 4.1. Let $f$ be a Boolean function on n variables $x_1, x_2, \ldots, x_n$. Then,
$f$ is unate if and only if $f|_{x_i=0} \le f|_{x_i=1}$ or $f|_{x_i=0} \ge f|_{x_i=1}$ holds for all $i$'s. □

An OBDD representing $f|_{x_i=0}$ (resp., $f|_{x_i=1}$) can be obtained in $O(|f|)$ time
from the OBDD representing $f$, where $|f|$ denotes its size [2]. The size does not
increase by a restriction $f|_{x_i=0}$ or $f|_{x_i=1}$. Since property $g \le h$ can be checked
in $O(|g| \cdot |h|)$ time [3], the unateness of $f$ can be checked in $O(n|f|^2)$ time by
checking conditions $f|_{x_i=0} \le f|_{x_i=1}$ and $f|_{x_i=0} \ge f|_{x_i=1}$ for all $i = 1, 2, \ldots, n$.
The following well-known property is useful to reduce the computation time.

Property 4.2. Let $f$ be a Boolean function on n variables $x_1, x_2, \ldots, x_n$. Then, $f$
is unate with polarity $p = (p_1, p_2, \ldots, p_n)$ if and only if both $f|_{x_n=0}$ and $f|_{x_n=1}$
are unate with same polarity $(p_1, p_2, \ldots, p_{n-1})$, $p_n = 0$ implies $f|_{x_n=0} \le f|_{x_n=1}$
and $p_n = 1$ implies $f|_{x_n=0} \ge f|_{x_n=1}$. □

The unateness of functions $f|_{x_n=0}$ and $f|_{x_n=1}$ can be checked by applying
Property 4.2 recursively, but we also have to check that $f|_{x_n=0}$ and $f|_{x_n=1}$ have
the same polarity. Our algorithm is similar to the implementation of
OBDD-manipulation systems by Bryant [3], in the sense that we cache all intermediate
computational results to avoid duplicate computation. In Bryant's idea, different
computational results may be stored to the same memory in order to handle
different operations, and hence the same computation may be repeated more
than once. However, as our algorithm only aims to check the unateness, it can
avoid such cache conflict by explicitly preparing memory for each result. This is
a key to reduce the computation time.

Algorithm CHECK-UNATE

Input: An OBDD representing $f$ with a variable ordering $(x_n, \ldots, x_1)$.
Output: "yes" and its polarity if $f$ is unate; otherwise, "no".

Step 1 (initialize). Set $\ell := 1$; $p[i] := *$ for all $i = 1, 2, \ldots, n$;

    imp[u, v] := NO   if (u, v) = (1, 0);
                 YES  if (u, v) = (0, 0), (0, 1), (1, 1);
                 *    otherwise.

Step 2 (check unateness in level $\ell$ and compute $p[\ell]$). For each
node $v$ in level $\ell$ (i.e., labeled with $x_\ell$), apply Steps 2-1 and 2-2.

Step 2-1. Set pol := 0 if imp[0-succ(v), 1-succ(v)] = YES holds; set
pol := 1 if imp[1-succ(v), 0-succ(v)] = YES holds; otherwise, output
"no" and halt.

Step 2-2. If $p[\ell] = *$, then set $p[\ell]$ := pol. If $p[\ell] \ne *$ and $p[\ell] \ne$ pol
hold, then output "no" and halt.

Step 3 (compute imp in level $\ell$). For each pair of nodes $u$ and $v$ such
that $level(u) \le \ell$ and $level(v) \le \ell$, and at least one of $level(u)$ and $level(v)$
is equal to $\ell$, set imp[u, v] := YES if both imp[0-succ'(u), 0-succ'(v)] and
imp[1-succ'(u), 1-succ'(v)] are YES; otherwise, set imp[u, v] := NO.

Step 4 (iterate). If $\ell = n$, where n is the level of the root node, then
output "yes" and polarity $p = (p[1], p[2], \ldots, p[n])$, and halt. Otherwise
set $\ell := \ell + 1$ and return to Step 2.

Fig. 1. An algorithm to check unateness of an OBDD.

We check the unateness of $f$ in the bottom-up manner by checking unateness
of all nodes corresponding to intermediate results. Note that the property
$f|_{x_n=0} \le f|_{x_n=1}$ (resp., $f|_{x_n=0} \ge f|_{x_n=1}$) can be also checked in the bottom-up
manner, since $g \le h$ holds if and only if $g|_{x_i=0} \le h|_{x_i=0}$ and $g|_{x_i=1} \le h|_{x_i=1}$
hold for some $i$.

Algorithm CHECK-UNATE in Fig. 1 checks the unateness and the polarity of
a given OBDD in the manner as described above. We use an array $p[\ell]$ to denote
the polarity of $f$ with respect to $x_\ell$ in level $\ell$; each element stores 0, 1 or * (not
checked yet). We also use a two-dimensional array imp[u, v] to denote whether
$f_u \le f_v$ holds or not; each element stores YES, NO or *. In Step 2, the unateness
with the unique polarity is checked for the nodes in level $\ell$. More precisely, the
unateness for them is checked in Step 2-1, and the uniqueness of their polarities
is checked in Step 2-2. In Step 3, imp[u, v] is computed for the nodes in level $\ell$.
Namely, $f_u$ and $f_v$ are compared and the result is set to imp[u, v]. The comparison
is performed easily, since the comparisons between $f_u|_{x_\ell=a_\ell}$ and $f_v|_{x_\ell=a_\ell}$ for
both $a_\ell = 0$ and 1 have already been completed. In Algorithm CHECK-UNATE,
0-succ'(v) (resp., 1-succ'(v)) denotes 0-succ(v) (resp., 1-succ(v)) if $level(v) = \ell$,
but denotes $v$ itself if $level(v) < \ell$. This is because $f_v|_{x_\ell=0} = f_{0\text{-}succ(v)}$ and
$f_v|_{x_\ell=1} = f_{1\text{-}succ(v)}$ hold if $level(v) = \ell$, and $f_v|_{x_\ell=0} = f_v|_{x_\ell=1} = f_v$ holds if
$level(v) < \ell$. After Step 3 is done for some $\ell$, we know imp[u, v] for all pairs of
nodes $u$ and $v$ such that $level(u) \le \ell$ and $level(v) \le \ell$. We store all the results,
although some of them may not be needed.
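
The steps of CHECK-UNATE can be sketched as follows. This is an illustrative transcription, not the authors' code: it assumes a hypothetical encoding in which `nodes` maps node names (strings) to triples `(level, 0-successor, 1-successor)`, the constants are the integers 0 and 1, the ordering is $(x_n, \ldots, x_1)$ so that a node labeled $x_\ell$ has level $\ell$, and every listed node is reachable from the root. Levels without nodes keep polarity `None`, playing the role of `*`.

```python
def check_unate(nodes, n):
    """Return the polarity list if the OBDD is unate, otherwise None."""
    def level(v):
        return 0 if v in (0, 1) else nodes[v][0]

    def succ(v, bit, l):
        # bit-succ'(v): the real successor at level l, v itself below level l
        return nodes[v][1 + bit] if level(v) == l else v

    # Step 1: imp[u, v] records whether f_u <= f_v; constants are known.
    imp = {(0, 0): True, (0, 1): True, (1, 1): True, (1, 0): False}
    p = [None] * (n + 1)
    seen = [0, 1]
    for l in range(1, n + 1):
        current = [v for v in nodes if nodes[v][0] == l]
        for v in current:  # Step 2
            _, lo, hi = nodes[v]
            if imp[(lo, hi)]:
                pol = 0
            elif imp[(hi, lo)]:
                pol = 1
            else:
                return None
            if p[l] is None:
                p[l] = pol
            elif p[l] != pol:
                return None
        below = seen + current
        for u in below:  # Step 3
            for v in below:
                if l in (level(u), level(v)):
                    imp[(u, v)] = (imp[(succ(u, 0, l), succ(v, 0, l))]
                                   and imp[(succ(u, 1, l), succ(v, 1, l))])
        seen = below
    return p[1:]


# f = x2 AND (NOT x1): unate with p[1] = 1, p[2] = 0.
unate_obdd = {"s": (1, 1, 0), "r": (2, 0, "s")}
# f = x2 XOR x1: the two level-1 nodes demand opposite polarities.
xor_obdd = {"a": (1, 0, 1), "b": (1, 1, 0), "r": (2, "a", "b")}
unate_result = check_unate(unate_obdd, 2)
xor_result = check_unate(xor_obdd, 2)
```

Each entry of `imp` is filled once by a constant number of dictionary lookups, which mirrors the quadratic bound discussed next.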

Next, we consider the computation time of this algorithm. In Step 2, checking
unateness for each node is performed in a constant time from the data computed
in the previous iteration. The unateness is checked for all nodes. In Step 3, the
comparison between $f_u$ and $f_v$ for each pair of nodes $u$ and $v$ is also performed
in a constant time. The number of pairs compared in Step 3 during the entire
computation is $O(|f|^2)$, and this requires $O(|f|^2)$ time.

Theorem 4.1. Given an OBDD representing a Boolean function $f$ of size $|f|$,
checking whether $f$ is unate can be done in $O(|f|^2)$ time. □

If we start Algorithm CHECK-UNATE with initial condition $p[i] := 0$ (resp.,
$p[i] := 1$) for all $i$'s, we can check the positivity (resp., negativity) of $f$.

Corollary 4.1. Given an OBDD representing a Boolean function $f$ of size $|f|$,
checking whether $f$ is positive (resp., negative) can be done in $O(|f|^2)$ time. □

5 Checking Horness of OBDD 

5.1 Conditions for Horness 

In this section, we discuss the problem of checking whether a given OBDD is
Horn. Denoting $f|_{x_n=0}$ and $f|_{x_n=1}$ by $f_0$ and $f_1$ for simplicity, $f$ is given by
$f = \bar{x}_n f_0 \vee x_n f_1$, where $f_0$ and $f_1$ are Boolean functions on $n - 1$ variables
$x_1, x_2, \ldots, x_{n-1}$. By definition, we can determine whether $f$ is Horn by checking
the condition $f \wedge_{bit} f = f$. For this, we may first construct an OBDD of $f \wedge_{bit} f$,
and then check the equivalence between $f \wedge_{bit} f$ and $f$. However, the following
theorem says that this approach may require exponential time and hence is
intractable in general.

Theorem 5.1. There exists a Boolean function f on n variables, for which 
OBDD requires size 0{n^), while the OBDD representing /Abit/ requires 
for the same variable ordering. 

Proof. Consider a function /b = {\J(f^{{xi^Xi+ra)£\Xi+sra£\{f\j(z{l^,„^ 2 m}-{i+m} 

Tj+ 2 m)))V(V”Li((^iAx*+™)Aa;i+ 2 mA(Ajg{p„„ 2 m}-{*}^i-e 2 m))), where n = 4m. 

□ 

Our main result however shows that $f \wedge_{bit} f = f$ can be checked in polynomial
time without explicitly constructing the OBDD of $f \wedge_{bit} f$. For this goal,
the following lemmas tell a key property that the problem can be divided into
two subproblems, and hence can be solved by a divide-and-conquer approach.

Algorithm CHECK-HORN

Input: An OBDD representing $f$ with a variable ordering $(x_n, \ldots, x_1)$.
Output: "yes" if $f$ is Horn; otherwise, "no".

Step 1 (initialize). Set $\ell := 1$;

    horn[v] := YES  if v in {0, 1};
               *    otherwise;

    bit-imp[u, v, w] := NO   if (u, v, w) = (1, 1, 0);
                        YES  if u, v, w in {0, 1} and (u, v, w) != (1, 1, 0);
                        *    otherwise.

Step 2 (check Horness in level $\ell$). For each node $v$ in level $\ell$ (i.e.,
labeled with $x_\ell$), set horn[v] := YES if all of horn[0-succ(v)], horn[1-succ(v)]
and bit-imp[0-succ(v), 1-succ(v), 0-succ(v)] are YES; otherwise, output
"no" and halt.

Step 3 (compute bit-imp in level $\ell$). For each triple (u, v, w) of nodes
such that $level(u) \le \ell$, $level(v) \le \ell$ and $level(w) \le \ell$, and at least one of
$level(u)$, $level(v)$ and $level(w)$ is equal to $\ell$, check whether $f_u \wedge_{bit} f_v \le f_w$
holds according to Fig. 3. Set the result YES or NO to bit-imp[u, v, w].

Step 4 (iterate). If $\ell = n$ then output "yes" and halt. Otherwise set
$\ell := \ell + 1$ and return to Step 2.



Fig. 2. An algorithm to check Horness of an OBDD. 



Lemma 5.1. Let $f$ be a Boolean function on n variables $x_1, x_2, \ldots, x_n$, which
is expanded as $f = \bar{x}_n f_0 \vee x_n f_1$. Then, $f$ is Horn if and only if both $f_0$ and $f_1$
are Horn and $f_0 \wedge_{bit} f_1 \le f_0$ holds.

The Horness of $f_0$ and $f_1$ can be also checked by applying Lemma 5.1
recursively. The following lemma says that the condition $f_0 \wedge_{bit} f_1 \le f_0$ can be also
checked recursively. Note that the condition of type $f \wedge_{bit} g \le f$ in Lemma 5.1
requires to check the condition of type $f_1 \wedge_{bit} g_0 \le f_0$ (i.e., checking of type
$f \wedge_{bit} g \le h$ for three functions $f$, $g$ and $h$).

Lemma 5.2. Let $f$, $g$ and $h$ be Boolean functions on n variables, which are
expanded as $f = \bar{x}_n f_0 \vee x_n f_1$, $g = \bar{x}_n g_0 \vee x_n g_1$ and $h = \bar{x}_n h_0 \vee x_n h_1$, respectively.
Then, property $f \wedge_{bit} g \le h$ holds if and only if $f_0 \wedge_{bit} g_0 \le h_0$, $f_0 \wedge_{bit} g_1 \le h_0$,
$f_1 \wedge_{bit} g_0 \le h_0$ and $f_1 \wedge_{bit} g_1 \le h_1$ hold. □



5.2 Algorithm to Check Horness 

Algorithm CHECK-HORN in Fig. 2 checks the Horness of a given OBDD in the
bottom-up manner by applying Lemmas 5.1 and 5.2 recursively. The bottom-up
and caching techniques used there are similar to those of CHECK-UNATE.
However, we emphasize here that, in the case of unateness, the naive algorithm to
check the condition of Property 4.1 was already polynomial time, while the naive
algorithm checking $f \wedge_{bit} f = f$ would require exponential time by Theorem 5.1.
CHECK-HORN is the first polynomial time algorithm, which is made possible
by using both Lemmas 5.1 and 5.2.



Ordered Binary Decision Diagrams as Knowledge-Bases 






    bit-imp[u, v, w] :=
        YES  if all of bit-imp[1-succ'(u), 1-succ'(v), 1-succ'(w)],
             bit-imp[0-succ'(u), 0-succ'(v), 0-succ'(w)],
             bit-imp[0-succ'(u), 1-succ'(v), 0-succ'(w)]
             and bit-imp[1-succ'(u), 0-succ'(v), 0-succ'(w)] are YES;
        NO   otherwise.

Fig. 3. Checking bit-imp[u, v, w] (i.e., $f_u \wedge_{bit} f_v \le f_w$) for a triple of
nodes (u, v, w) in Step 3.



In Algorithm CHECK-HORN, we use an array horn[v] to denote whether
node $v$ represents a Horn function or not, and a three-dimensional array
bit-imp[u, v, w] to denote whether $f_u \wedge_{bit} f_v \le f_w$ holds or not; each element of the
arrays stores YES, NO or * (not checked yet). horn[v] = YES implies that $f_v$ is Horn
even if $f_v$ is treated as a Boolean function on more than $level(v)$ variables. (Recall
that the OBDD is reduced; all the added variables are redundant.) Similarly,
bit-imp[u, v, w] = YES implies that $f_u \wedge_{bit} f_v \le f_w$ holds even if $f_u$, $f_v$ and $f_w$
are treated as Boolean functions on $\ell$ ($\ge \ell_{max}$) variables, where $\ell_{max}$ denotes
the maximum level of the nodes $u$, $v$ and $w$.

In Step 2 of Algorithm CHECK-HORN, horn[v] can be easily computed
according to Lemma 5.1. Note that every node $v$ in level $\ell$ satisfies $f_v|_{x_\ell=0} =
f_{0\text{-}succ(v)}$ and $f_v|_{x_\ell=1} = f_{1\text{-}succ(v)}$. Also note that horn[0-succ(v)], horn[1-succ(v)]
and bit-imp[0-succ(v), 1-succ(v), 0-succ(v)] have already been computed.

Similarly, bit-imp[u, v, w] in Step 3 can be also computed easily by Fig. 3
corresponding to Lemma 5.2. Similar to the case of checking unateness, 0-succ'(v)
(resp., 1-succ'(v)) denotes 0-succ(v) (resp., 1-succ(v)) if $level(v) = \ell$, but denotes
$v$ itself if $level(v) < \ell$. After Step 3 is done for some $\ell$, we have the results for all
triples (u, v, w) of nodes such that $level(u) \le \ell$, $level(v) \le \ell$ and $level(w) \le \ell$,
which include all the information required in the next iteration.
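
A compact sketch of CHECK-HORN under the same kind of assumptions may help: `nodes` is a hypothetical dictionary from node names (strings) to triples `(level, 0-successor, 1-successor)`, the constants are the integers 0 and 1, the ordering is $(x_n, \ldots, x_1)$, and all listed nodes are reachable from the root. The table `bit_imp` plays the role of bit-imp and is filled level by level following Fig. 3.

```python
def check_horn(nodes, n):
    """Return True if the OBDD represents a Horn function, else False."""
    def level(v):
        return 0 if v in (0, 1) else nodes[v][0]

    def succ(v, bit, l):
        # bit-succ'(v): the real successor at level l, v itself below level l
        return nodes[v][1 + bit] if level(v) == l else v

    # Step 1: on constants only (1, 1, 0) violates f_u AND_bit f_v <= f_w.
    bit_imp = {(u, v, w): (u, v, w) != (1, 1, 0)
               for u in (0, 1) for v in (0, 1) for w in (0, 1)}
    horn = {0: True, 1: True}
    seen = [0, 1]
    for l in range(1, n + 1):
        current = [v for v in nodes if nodes[v][0] == l]
        for v in current:  # Step 2 (Lemma 5.1)
            _, lo, hi = nodes[v]
            if not (horn[lo] and horn[hi] and bit_imp[(lo, hi, lo)]):
                return False
            horn[v] = True
        below = seen + current
        for u in below:  # Step 3 (Lemma 5.2 and Fig. 3)
            for v in below:
                for w in below:
                    if l in (level(u), level(v), level(w)):
                        bit_imp[(u, v, w)] = all(
                            bit_imp[(succ(u, a, l), succ(v, b, l),
                                     succ(w, a & b, l))]
                            for a in (0, 1) for b in (0, 1))
        seen = below
    return True


# x1 OR x2 is not Horn: models (0,1) and (1,0) have bitwise AND (0,0).
or_obdd = {"s": (1, 0, 1), "r": (2, "s", 1)}
# (NOT x1) OR (NOT x2) is Horn (a single Horn clause).
nand_obdd = {"s": (1, 1, 0), "r": (2, 1, "s")}
or_is_horn = check_horn(or_obdd, 2)
nand_is_horn = check_horn(nand_obdd, 2)
```

The four sub-checks per triple use the third index `a & b`, which is exactly the pattern of the four conditions in Lemma 5.2.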

Now, we consider the computation time of Algorithm CHECK-HORN. In
Step 2, horn[v] for each node $v$ is computed in a constant time. The Horness is
checked for all nodes. In Step 3, bit-imp[u, v, w] for each triple of nodes (u, v, w) is
also computed in a constant time. The number of triples to be checked in Step 3
during the entire computation is $O(|f|^3)$, where $|f|$ is the size of the given OBDD,
and this requires $O(|f|^3)$ time. The time for the rest of computation is minor.

Theorem 5.2. Given an OBDD representing a Boolean function $f$ of size $|f|$,
checking whether $f$ is Horn can be done in $O(|f|^3)$ time. □

6 Conclusion 

In this paper, we considered using OBDDs to represent knowledge-bases. We
have shown that the conventional CNF-based and model-based representations,
and the new OBDD representation are mutually incomparable with respect to
space requirement. Thus, OBDDs can find their place in knowledge-bases, as
they can represent some theories more efficiently than others.

We then considered the problem of recognizing whether a given OBDD
represents a unate Boolean function, and whether it represents a Horn function.
It turned out that checking unateness can be done in time quadratic in the size
of the OBDD, while checking Horness can be done in cubic time.

OBDDs are predominantly used in the field of computer-aided design and
verification of digital systems. The reason for this is that many Boolean functions
which we encounter in practice can be compactly represented, and that many
operations on OBDDs can be efficiently performed. We believe that OBDDs are
also useful for manipulating knowledge-bases. Developing efficient algorithms for
knowledge-base operations such as deduction and abduction should be addressed
in future work.

Abstract. In this paper we aim at an understanding of the fundamental
algorithmic limitations on what a set of autonomous mobile robots can 
or cannot achieve. We study a hard task for a set of weak robots. The 
task is for the robots in the plane to form any arbitrary pattern that 
is given in advance. The robots are weak in several aspects. They are 
anonymous; they cannot explicitly communicate with each other, but 
only observe the positions of the others; they cannot remember the past; 
they operate in a very strong form of asynchronicity. We show that the 
tasks that such a system of robots can perform depend strongly on their 
common knowledge about their environment, i.e., the readings of their 
environment sensors. 



... University. We also gratefully acknowledge the partial support of the Swiss National
Science Foundation, the Natural Science and Engineering Research Council of ...

1 Introduction, Definitions, and Overview

1.1 Autonomous Mobile Robots

We study the problem of coordinating a set of autonomous, mobile robots in the
plane. The coordination mechanism must be totally decentralized, without any
central control. The robots are anonymous, in the sense that a robot does not
have an identity that it can use in a computation, and all robots execute the
exact same algorithm. Each robot has its own, local view of the world. This view
includes a local Cartesian coordinate system with origin, unit of length, and the
directions of two coordinate axes, identified as x axis and y axis, together with
their orientations, identified as the positive sides of the axes. The robots do not
have a common understanding of the handedness (chirality) of the coordinate
system that allows them to consistently infer the orientation of the y axis once
the orientation of the x axis is given; instead, knowing North does not distinguish 
East from West. The robots observe the environment and move; this is their only 
means of communication and of expressing a decision that they have taken. The 
only thing that a robot can do is make a step, where a step is a sequence of three 
actions. First, the robot observes the positions of all other robots with respect 
to its local coordinate system. Each robot is viewed as a point, and therefore 
the observation returns a set of points to the observing robot. The robot cannot 
distinguish between its fellow robots; they all look identical. In addition, the 
robot cannot detect whether there is more than one fellow robot on any of 
the observed points; we say, it cannot detect multiplicity. Second, the robot 
performs an arbitrary local computation according to its algorithm, based only 
on its common knowledge of the world (assumed e.g. to be stored in read-only
memory and to be read off from sensors of the environment) and the observed
set of points. Since the robot does not memorize anything about the past, we call 
it oblivious. For simplicity, we assume that the algorithm is deterministic, but 
it will be obvious that all of our results hold for nondeterministic algorithms as 
well (randomization, however, makes things different). Third, as a result of the 
computation, the robot either stands still, or it moves (along any curve it likes). 
The movement is confined to some (potentially small) unpredictable, nonzero 
amount. Hence, the robot can only go towards its goal along a curve, but it 
cannot know how far it will come in the current step, because it can fall asleep 
anytime during its movement. While it is on its continuous move, a robot may 
be seen an arbitrary number of times by other robots, even within one of its steps.

The system is totally asynchronous, in the sense that there is no common 
notion of time. Each robot makes steps at unpredictable time instants. The 
(global) time that passes between two successive steps of the same robot is 
finite; that is, any desired finite number of steps will have been made by any 
robot after some finite amount of time. In addition, we do not make any timing 
assumptions within a step: The time that passes after the robot has observed the 
positions of all others and before it starts moving is arbitrary, but finite. That is, 
the actual move of a robot may be based on a situation that lies arbitrarily far 
in the past, and therefore it may be totally different from the current situation. 
We feel that this assumption of asynchronicity within a step is important in a 
totally asynchronous environment, since we want to give each robot enough time 
to perform its local computation. 

1.2 Pattern Formation 

In this paper, we concentrate on the particular coordination problem that requires
the robots to form a specific geometric pattern, the pattern formation
problem. This problem has been investigated quite a bit in the literature, mostly
as an initial step that gets the robots together and then lets them proceed in
the desired formation (just like a flock of birds or a troop of soldiers); it is
interesting algorithmically, because if the robots can form any pattern, they can
agree on their respective roles in a subsequent, coordinated action. We study this
problem for arbitrary geometric patterns, where a pattern is a set of points (given
by their Cartesian coordinates) in the plane. The pattern is known initially
by all robots in the system. For instance, we might require the robots to place 
themselves on the circumference of a circle, with equal spacing between any two 
adjacent robots, just like kids in the kindergarten are sometimes requested to 
do. We do not prescribe the position of the circle in the world, and we do not 
prescribe the size of the circle, just because the robots do not have a notion of 
the world coordinate system’s origin or unit of length. The robots are said to 
form the pattern, if the actual positions of the robots coincide with the points 
of the pattern, where the pattern may be translated, rotated, scaled, and flipped 
into its mirror position in each local coordinate system. Initially, the robots are 
in arbitrary positions, with the only requirement that no two robots are in the 
same position, and that of course the number of points prescribed in the pattern 
and the number of robots are the same. Note that in our algorithms, we do not 
need to and we will not make use of the possibility of rotating the pattern. 
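
The notion of "forming the pattern" can be made concrete with a small global checker. The sketch below is only an outside observer's test, not something an individual robot runs: it assumes exact coordinates up to a tolerance `tol`, represents points as complex numbers, and tries every way of mapping the first two pattern points onto an ordered pair of robots, with and without mirroring. The names `forms_pattern` and `_same_set` are illustrative, and the brute-force search is chosen for clarity rather than efficiency.

```python
from itertools import permutations


def _same_set(xs, ys, tol):
    """True if the point lists xs and ys coincide as multisets, up to tol."""
    remaining = list(ys)
    for x in xs:
        for i, y in enumerate(remaining):
            if abs(x - y) <= tol:
                del remaining[i]
                break
        else:
            return False
    return True


def forms_pattern(robots, pattern, tol=1e-9):
    """Check whether the robot positions equal the pattern up to
    translation, rotation, uniform scaling and reflection.

    Both arguments are lists of distinct (x, y) points.
    """
    if len(robots) != len(pattern):
        return False
    if len(pattern) < 2:
        return True
    rs = [complex(x, y) for x, y in robots]
    for flip in (False, True):
        # Mirroring the pattern is complex conjugation.
        ps = [complex(x, -y) if flip else complex(x, y) for x, y in pattern]
        p0, p1 = ps[0], ps[1]
        for a, b in permutations(rs, 2):
            s = (b - a) / (p1 - p0)  # rotation and scaling as one complex factor
            image = [a + s * (p - p0) for p in ps]
            if _same_set(image, rs, tol):
                return True
    return False
```

For example, four robots on the corners of any axis-parallel square form the unit-square pattern, while four robots on the corners of a 2 by 1 rectangle do not.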

The pattern formation problem for arbitrary patterns is quite a general mem- 
ber in the class of problems that are of interest for autonomous, mobile robots. 
It includes as special cases many coordination problems, such as leader election: 
We just define the pattern in such a way that the leader is represented uniquely 
by one point in the pattern. This reflects the general direction of the investiga- 
tion in this paper: What coordination problems can be solved, and under what 
conditions? The only means for the robots to coordinate is the observation of the 
others’ positions; therefore, the only means for a robot to send information to 
some other robot is to move and let the others observe (reminiscent of bees in a
bee dance). For oblivious robots, even this sending of information is impossible,
since the others will not remember previous positions. Hence, our study is at the 
extreme end in two ways: The problem is extremely hard, and the robots are 
extremely weak. 

In an attempt to understand the power of common knowledge for the co- 
ordination of robots, we study the pattern formation problem under several 
assumptions. We give a complete characterization of what can and what cannot 
be achieved. First, we show that for an arbitrary number of robots that know the 
direction and the orientation of both axes, the pattern formation problem can 
be solved. Here, knowing the direction of the x axis means that all robots know 
and use the fact that all the lines identifying their individual x axes are par- 
allel. Similarly, knowing the orientation of an axis means that the positive side 
of that axis in the coordinate system coincides for all robots. Second, we study 
the case of the robots knowing one axis direction and orientation. We show that 
the pattern formation problem can be solved whenever the number of robots 
is odd, and that it is in general unsolvable when the number of robots is even. 
Third, we show that the situation is the same, if one axis direction is known, but 
not the orientation of the axis. Fourth, we show that if no axis direction (and 
therefore also no orientation) is known, the problem cannot be solved in general. 
For brevity, all proofs are omitted. The reader is referred to [4].

1.3 Related Work

The problem of controlling a set of autonomous, mobile robots in a distributed 
fashion has been studied extensively, but almost exclusively from an engineering 
and from an artificial intelligence point of view. In a number of remarkable 
studies (on social interaction leading to group behavior [^, on selfish behavior 
of cooperative agents in animal societies |^, on primitive animal behavior in 
pattern formation [^, to pick just a few), algorithmic aspects were somehow 
implicitly an issue, but clearly not a major concern, let alone the focus, of the 
study. 

We aim at identifying the algorithmic limitations of what autonomous, mobile 
robots can or cannot do. An investigation with this flavor has been undertaken 
within the AI community by Durfee [2], who argues in favor of limiting the 
knowledge that an intelligent agent must possess in order to be able to coordinate 
its behavior with others. The work of Suzuki and Yamashita IzlSi is closest 
to our study (and, with this focus, a rarity in the mobile robots literature); it 
gives a nice and systematic account of the algorithmics of pattern formation
for robots, under several assumptions on the power of the individual robot. The
models that we use differ from those of |lll|9| in the fact that our robots 
are as weak as possible in every single aspect of their behavior. The reason is 
that we want to identify the role of the robots’ common knowledge of the world 
for performing a task. In contrast with BIS10, we do not assume that on a 
move, we know ahead of time the limited, but nonzero distance that a robot 
travels. We do not assume that the distance that a robot may travel in one 
step is so short that no other robot can see it while it is moving. We do not 
assume that the robots have a common handedness, called sense of orientation 
in [ZUn]. 

The most radical deviation from previous models may, however, be our as- 
sumption of asynchronicity within one step. In contrast, [3 019] assume the 
atomicity of a step: A robot moves immediately after it has observed the current 
situation, with all awake robots moving at the same clock tick (some robots may 
be asleep). This difference influences the power of the system of robots so drastically
that in general, algorithms that make use of atomicity within one step
do not work in our model; in particular, this is true for the work in 13012]. 

2 Knowledge of Both Axis Directions and Orientations 

For the case in which the directions and orientations of both axes are common
knowledge, the robots can form an arbitrary given pattern when each robot
executes the following algorithm in each step.

Algorithm 1 (Both axis directions and orientations). 

Input: An arbitrary pattern P described as a sequence of points p_1, ..., p_n,
given in lexicographic order. The directions and orientations of the x axis and
the y axis are common knowledge.

Begin
    α := Angle(p_1, p_2);
    Give a lexicographic order to all the robots in the system,
        including myself, say from left to right and from bottom to top;
    A := First robot in the order;
    B := Second robot in the order;
    β := Angle(A, B);
    If I am B Then Do_nothing()
    Else If I am A
         Then If α = β Then Do_nothing()
              Else Go_Into_Position(A, B, α)
         Else % I am neither A nor B %
              If α = β
              Then Unit := |AB|;
                   % all the robots agree on a common unit distance %
                   Final_Positions := Find_Final_Positions(A, B, Unit);
                   If I am on one of the Final_Positions
                   Then Do_nothing()
                   Else Free_Robots := {robots not on one of the
                                        Final_Positions};
                        Free_Points := {Final_Positions with no robots
                                        on them};
                        Go_To_Points(Free_Robots, Free_Points);
              Else Do_nothing()
End

Angle(p, q) computes the angle between the positive horizontal axis passing
through p and the segment pq. Do_nothing() terminates the local computation
and the current step of the calling robot.

Go_Into_Position(A, B, α) orders A to move so as to achieve angle α
with B while staying lexicographically first.

Find_Final_Positions(A, B, Unit) figures out the final positions of the
robots according to the given pattern, and the positions of A and B. The common
scaling of the input pattern is defined by the common unit distance Unit.
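
The geometric helpers of Algorithm 1 can be sketched as follows. This is one plausible reading, under the assumptions that points are (x, y) tuples in a frame whose axes all robots share, that the lexicographic order sorts by x and then by y, and that once the angles α and β agree the pattern is placed by translating p_1 onto A and scaling so that the segment p_1 p_2 has the length of AB. The Python names are illustrative, and `find_final_positions` takes the pattern as an explicit argument for self-containment.

```python
import math


def lex_order(points):
    """Order points from left to right, then from bottom to top."""
    return sorted(points)


def angle(p, q):
    """Angle between the positive horizontal axis through p and segment pq."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def find_final_positions(pattern, a, b):
    """Translate and scale the pattern so that p_1 lands on A and
    |p_1 p_2| equals the common unit |AB|; no rotation is applied."""
    p = lex_order(pattern)
    unit = math.dist(a, b)
    scale = unit / math.dist(p[0], p[1])
    return [(a[0] + scale * (x - p[0][0]), a[1] + scale * (y - p[0][1]))
            for x, y in p]


pattern = [(0, 0), (1, 0), (0, 1)]
a, b = (2, 2), (2, 5)   # first and second robot in the lexicographic order
p = lex_order(pattern)
aligned = math.isclose(angle(p[0], p[1]), angle(a, b))
targets = find_final_positions(pattern, a, b)
```

In this instance the angles already agree, so every robot other than A and B would compute the same three target positions, two of which are occupied by A and B themselves.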

Go_To_Points(Free_Robots, Free_Points) chooses the robot in Free_Robots
that is closest to a point in Free_Points and moves it, as follows:

Go_To_Points(Free_Robots, Free_Points)
Begin
    (r, p) := Minimum(Free_Robots, Free_Points);
    If I am r Then Move(p)
    Else Do_nothing()
End

Minimum(Free_Robots, Free_Points) finds one of the Free_Robots that has
the minimum Euclidean distance from one of the Free_Points (i.e. with no robot
on it). If more than one robot has minimum distance, the one smaller in the
lexicographic order is chosen.

Fig. 1. Breaking Symmetry from (a) to (b); defining the sides (c); the unbreakable
symmetry of a 5-gon (d)

Move(p) terminates the local computation of the calling robot and moves it
towards p.
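
The selection rule of Minimum and the decision taken in Go_To_Points can be illustrated with a short sketch. It assumes that points are (x, y) tuples in a shared frame, so that every robot evaluates the same ordering. Ties between robots are broken by lexicographic order as stated above; breaking a remaining tie between several equally close free points by lexicographic order as well is an assumption added here so that all robots obtain the same pair. The function returns the target of the move, or None when the calling robot does nothing.

```python
import math


def minimum(free_robots, free_points):
    """Pick the (robot, point) pair with minimum Euclidean distance,
    preferring the lexicographically smaller robot (then point) on ties."""
    return min(((r, p) for r in free_robots for p in free_points),
               key=lambda rp: (math.dist(rp[0], rp[1]), rp[0], rp[1]))


def go_to_points(me, free_robots, free_points):
    """Return the point the calling robot should move towards, or None."""
    r, p = minimum(free_robots, free_points)
    return p if me == r else None


free_robots = [(0, 0), (4, 0)]
free_points = [(2, 0), (10, 0)]
```

Both free robots are at distance 2 from the free point (2, 0), so the lexicographically smaller robot (0, 0) is selected and the robot at (4, 0) stays put in this step.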

Theorem 1. With Algorithm 1, the robots correctly form the input pattern P.

3 Knowledge of One Axis Direction and Orientation 

Let us now look at the case when only one axis direction and orientation are 
known. As an aside, note that this case would trivially coincide with the first 
one if the robots had a common handedness (or sense of orientation, as
Suzuki and Yamashita call it HUS]). We first show that in general, it is impossible 
to break the symmetry of a situation. We will then show that for the special case 
of an odd number of robots, symmetry can be broken and an arbitrary pattern 
can be formed. 

Theorem 2. In a system with n anonymous robots that agree only on one axis 
direction and orientation, the pattern formation problem is unsolvable when n is 
even.

In contrast, we now show that for breaking the symmetry, it is enough to 
know that the number n of robots is odd. 

Algorithm 2 (One axis direction and orientation). 

Input: An arbitrary pattern P described as a sequence of points p_1, ..., p_n, given
in lexicographic order. The direction and orientation of the y axis are common
knowledge.

Begin
    If We are in a final configuration Then Do_nothing();
    p_m := Median_Pattern_Point(P);     % median pattern point in x direction %
    p := Outermost_Pattern_Point(P);    % outermost pattern point w.r.t. p_m %
    Pattern_Unit_Length := Horizontal distance in the pattern
                           between the vertical lines through p_m and p;
    K := Median_Robot_Line();           % through the median robot position %
    (Outer_Robot, Outer_Line, positive_X_orientation) :=
                           Outermost_Robot_Position(K);
    Median_Robot := Find_Median_Robot(K);
    Final_Positions := Find_Final_Positions(Median_Robot,
                           positive_X_orientation, Pattern_Unit_Length,
                           Distance(K, Outer_Line) / 2);
    Free_Points := {Final_Positions on my side with no robots on them};
    If I am the Median_Robot Then Do_nothing()
    Else If I am the Outer_Robot
         Then If Free_Points contains just one (last) free point
              Then Move_Towards(last free point)
              Else Do_nothing()
         Else If I am on one of the Final_Positions Then Do_nothing()
              Else Free_Robots := {robots on my side not on Final_Positions};
                   Go_To_Points(Free_Robots, Free_Points)
End



Median_Pattern_Point(P) finds the median point in direction of x in the
input pattern P according to the local orientation of the axis. The ordering is
given left-right, bottom-up.

Outermost_Pattern_Point(P) finds the point in the input pattern P that
lies on the vertical line farthest from the vertical line through the median. If
more than one point exists, then the highest and rightmost is chosen according
to the local orientation of the axis.

Median_Robot_Line() returns the vertical line through the median robot position.
Note that the position of this line does not depend on the local orientations
of the x axes of the robots.
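
The independence of the median line from the local x orientation can be checked on a small instance. The sketch below assumes an odd number of robots given as (x, y) tuples in one robot's local frame and represents the vertical line K by its x coordinate; a second robot with the opposite x orientation sees every x coordinate negated. The name `median_robot_line` is illustrative.

```python
def median_robot_line(robots):
    """Return the x coordinate of the vertical line through the median
    robot position (odd number of robots assumed)."""
    xs = sorted(x for x, _ in robots)
    return xs[len(xs) // 2]


robots = [(0, 0), (3, 1), (5, 2), (1, 7), (9, 9)]
mirrored = [(-x, y) for x, y in robots]   # same scene, x axis flipped

k_local = median_robot_line(robots)
k_mirrored = median_robot_line(mirrored)
```

The two robots obtain coordinates of opposite sign, which describe the same physical line, so they agree on K even though they disagree on which side is positive.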

Outermost_Robot_Position(K) uniquely determines a robot that is outermost
with respect to K. It does so by breaking symmetry of the situation, if
necessary, in the following way (see Figure 1(a) and (b)):

Outermost_Robot_Position(K)
Begin
    (Outer_Robot, Outer_Robot', unique) := Outer_Two_Robots(K);
    If not unique
    Then If Symmetric(K)
         Then If I am the median of the points on K
              Then Move(to my right by ε)
              Else Do_nothing()
         Else positive_X_orientation := Outermost_Asymmetry(K);
              Outer_Robot := Choose between Outer_Robot and Outer_Robot'
                             the one that lies on the positive_X_orientation;
              If I am Outer_Robot Then Move(away from K by ε)
              Else Do_nothing()
    Else positive_X_orientation := side on which Outer_Robot lies;
    Outer_Line := vertical line on which Outer_Robot lies;
    Return (Outer_Robot, Outer_Line, positive_X_orientation)
End

Outer_Two_Robots(K) finds either two or a unique single topmost robot(s)
that lie(s) on the farthest vertical line(s) from K. The variable unique tells
whether the robot found has been unique.

Symmetric(K) returns True if the current configuration of the robots in the
system is symmetric with respect to K, in the following sense. The configuration
is called symmetric with respect to K, if there is a perfect matching for all the
robots not on K, such that any two matched robots lie on two vertical lines that
are in symmetric position with respect to K (see Figure 1(a)).
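
The matching condition in this definition amounts to a counting condition: for every horizontal distance d > 0, the vertical line at distance d to the left of K must carry as many robots as the one at distance d to the right. The sketch below tests exactly that. It assumes robots are (x, y) tuples with exactly comparable coordinates (integers or fractions) and that K is given by its x coordinate `k`; the name `symmetric` mirrors the routine above but the signature is illustrative.

```python
from collections import Counter


def symmetric(robots, k):
    """True if the robots off the line x = k can be perfectly matched
    across it, line by line; y coordinates play no role."""
    left = Counter(k - x for x, _ in robots if x < k)
    right = Counter(x - k for x, _ in robots if x > k)
    return left == right


balanced = [(-2, 0), (2, 5), (0, 1)]
unbalanced = [(-2, 0), (3, 5), (0, 1)]
```

The first configuration is symmetric with respect to the line x = 0 although the two matched robots have different heights; the second is not, since the lines at distances 2 and 3 have no partners.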

Outermost_Asymmetry(K) identifies, for an asymmetric configuration, the
unique halfplane with respect to K in which a robot r lies that (in some matching)
has no symmetric partner with respect to K, and that is on the farthest
vertical line from K among all robots with this property (see Figure 1(b)).
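
One way to read this definition operationally is to look for the largest distance d from K at which the two sides carry different numbers of robots; the side with more robots at that distance contains an unmatched robot. The sketch below follows that reading, which is an interpretation rather than the authors' stated procedure. It assumes exactly comparable coordinates and returns the labels "left" and "right" in the caller's local frame; a robot with the opposite x orientation would name the same physical halfplane with the other label.

```python
from collections import Counter


def outermost_asymmetry(robots, k):
    """Return "left" or "right" for the halfplane of the line x = k that
    holds the outermost unmatched robot, or None if the configuration
    is symmetric with respect to that line."""
    left = Counter(k - x for x, _ in robots if x < k)
    right = Counter(x - k for x, _ in robots if x > k)
    unmatched = [d for d in set(left) | set(right) if left[d] != right[d]]
    if not unmatched:
        return None
    d = max(unmatched)  # farthest vertical line with a surplus robot
    return "right" if right[d] > left[d] else "left"


robots = [(-3, 0), (3, 0), (-1, 0), (2, 0), (0, 0)]
mirrored = [(-x, y) for x, y in robots]
```

In `robots` the outermost lines at distance 3 are balanced, and the farthest imbalance occurs at distance 2 on the right; in the mirrored view the same halfplane is labeled "left".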

Find_Median_Robot(K) finds the median robot position in the current configuration
of the robots. This median robot position splits the robot positions
into two equal size subsets, of size (n - 1)/2, defined as follows. One subset is in
the halfplane of positive_X_orientation, including the points on K that are above
the median robot position, and the other subset is in the other halfplane of K,
including the points on K that are below the median robot position; from now
on, we will call these the two sides (see Figure 1(c)). According to this definition,
the median robot position is unique, and each robot (even if it lies on K) can
decide to which side it belongs.

Find_Final_Positions(Median_Robot, Side, Pattern_Unit_Length,
World_Unit_Length) returns the set of final positions of the robots according
to the given pattern, based on the agreement on Median_Robot and on
positive_X_orientation. The common scaling of the input pattern is defined by
identifying Pattern_Unit_Length with World_Unit_Length.

Distance(K, L) returns the horizontal distance between the two vertical
lines K and L.



3.1 Correctness 

To see that the above algorithm solves the pattern formation problem for an
arbitrary pattern, we argue as follows. First, we show that the robots initially
arrive at an agreement configuration, by breaking symmetry if necessary.
Then, they translate and scale the pattern with respect to the median and the
outermost point, and finally they move to their destinations. To present this
argument in more detail, we start with a brief definition. A configuration (of
the robots) is a set of robot positions, one position per robot, with no position
occupied by more than one robot. An agreement configuration is a configuration
of the robots in which all robots agree on a unique median robot and a unique
outermost robot, as defined in the above routines Find_Median_Robot(K) and
Outermost_Robot_Position(K). A final configuration is a configuration of the
robots in which the robots form the desired pattern. Note that a final configuration
might or might not be an agreement configuration.

Theorem 3. With Algorithm 2, the robots correctly form the input pattern P. 

4 Knowledge of One Axis Direction 

Note that the only difference from the previous section is the lack of knowledge
about the axis orientation. For solving the pattern formation problem in this
case, we can use an algorithm similar to the one used in Section 3. We easily
observe that with slight modifications in Algorithm 2, the agreement on the
orientation of the x axis could have been achieved without using the knowledge
of the orientation of the y axis. More precisely, we can do the following:

1. Find the vertical line K on which the median robot lies (as before, this is
independent of the direction of x).

2. Find the outermost robots with respect to K. Since we do not know the
orientation of the given axis, let’s say the y axis, it is possible that we find
more than one outermost robot; there are at most four of them, on both
sides of K to the top and to the bottom. In this case, we will detect an
(outermost) asymmetry with respect to K as before, or create it as follows.
If the configuration is symmetric with respect to K, the median robot is
uniquely identified, and it moves by some small amount ε > 0 to its right,
breaking the symmetry. So now, as in the previous section, all the robots
agree on the positive direction of the x axis (the side where the outermost
asymmetry lies), and at most 2 outermost robots remain (the bottom and
top ones on the positive x side, say).

3. The same technique and argument now applies to the x axis as the given
one. In this way, an agreement also on the orientation of the y axis can be
reached. Now, we can select a unique outermost robot out of the at most
two that were remaining, and let it (for convenience) move by ε outwards.

4. The robots can compute their unique final positions and go towards them,
in the same way as in Algorithm 2.

Using Theorem 2, we therefore conclude:

Theorem 4. With common knowledge of one axis direction, an odd number of
autonomous, anonymous, oblivious, mobile robots can form an arbitrary given
pattern, while an even number cannot.

5 No Knowledge

The following theorem states that giving up the common knowledge on at least 
one axis direction leads to the inability of the system to form an arbitrary pat- 
tern. 

Theorem 5. With no common knowledge, a set of autonomous, anonymous, 
oblivious, mobile robots cannot form an arbitrary given pattern. 

6 Discussion 

We have shown that from an algorithmic point of view, only the most fundamen- 
tal aspects of mobile robot coordination are being understood. In a forthcoming 
paper, we propose two algorithms for the point formation problem for oblivious 
robots; the first one does not need any common knowledge, and the second one 
works with limited visibility, when two axes are known [3]. There is a wealth
of further questions that suggest themselves. For example, we have shown that 
an arbitrary pattern cannot always be formed; it is interesting to understand 
in more detail which patterns or classes of patterns can be formed under which 
conditions, because this indicates which types of agreement can be reached, and 
therefore which types of tasks can be performed. Slightly faulty snapshots, a 
limited range of visibility, obstacles that limit the visibility and that moving robots
must avoid or push aside, as well as robots that appear and disappear from
the scene clearly suggest that the algorithmic nature of distributed coordination 
of autonomous, mobile robots merits further investigation. 

Abstract. We study load balancing problems with temporary jobs (i.e.,
jobs that arrive and depart at unpredictable times) in two different contexts,
namely, machines and network paths. Such problems are known
as machine load balancing and virtual circuit routing in the literature.
We present new on-line algorithms and improved lower bounds.



1 Introduction 

In this paper we study on-line algorithms for load balancing with temporary 
jobs (i.e., jobs that arrive and depart at unpredictable time) in two different 
contexts, namely, machines and network paths. Such problems are known as 
machine load balancing and virtual circuit routing in the literature (see [1 11 
for a survey). As for the former, we investigate a number of settings, namely, the 
list model, the interval model and the tree model. Our results show that these 
settings, though similar, cause the complexity of the load balancing problem to 
vary drastically, with competitive ratio jumping from Θ(1) to Θ(log n) and to
Θ(√n). We also study these settings in the more general cluster-based model.
Regarding the virtual circuit routing problem, we give the first algorithm with a
sub-linear competitive ratio of O(m^{2/3}) when the network contains m edges with
identical capacity. We also improve the lower bound from Ω(m^{1/4}) to Ω(m^{1/2}).
When edge capacities are not identical, our algorithm is O(W^{2/3})-competitive,
where W is the total edge capacity normalized to the minimum edge capacity.

1.1 On-Line Machine Load Balancing 

We study the following on-line problem. There are n machines with identical 
speed. Jobs arrive and depart at unpredictable time. Each job comes with a 
positive load. When a job arrives, it must be assigned immediately to a machine 
in a non-preemptive fashion, increasing the load of that machine by the job load 
until the job departs. The objective is to minimize the maximum load of any 
single machine over all time. As with previous work, we measure the performance 
of an on-line algorithm in terms of competitive ratio (see [1 1| for a survey), which 
is the worst-case ratio of the maximum load generated by the on-line algorithm 
to the maximum load generated by the optimal off-line algorithm. 

Table 1. Competitive ratios in different settings of assignment restriction.

Model         List    Interval    Two intervals, Tree,
                                  Arbitrary restriction
n machines    Θ(1)    Θ(log n)    Θ(√n)
k clusters    Θ(1)    Θ(log k)    O(√K)

The above on-line load balancing problem has been studied extensively in
the literature (see e.g., [T^ 171 IFl H]). Existing results are distinguished by the
presence of restrictions on machine assignment. In the simple case, every job can
be assigned to any machine. It has long been known that Graham’s greedy
algorithm is (2 - o(1))-competitive |H1[5|. A matching lower bound was obtained
recently by Azar and Epstein j^. In the model with assignment restriction, each
job specifies an arbitrary subset of the machines for possible assignment. For this
model, Azar, Broder, and Karlin proved that the competitive ratio of any on-line
algorithm is Ω(√n). An algorithm (called Robin-Hood) with a matching
upper bound is given in [J|.
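
The unrestricted setting is easy to simulate. The sketch below implements the greedy rule in its usual form (each arriving job goes to a currently least loaded machine) for temporary jobs and reports the maximum load observed over time. The event encoding, the function name `greedy_max_load` and the tie-breaking by lowest machine index are assumptions made for illustration.

```python
def greedy_max_load(n, events):
    """Simulate greedy assignment of temporary jobs on n identical machines.

    events is a list of ("arrive", job_id, load) or ("depart", job_id)
    tuples in chronological order. Returns the maximum load that any
    machine carries at any time.
    """
    loads = [0] * n
    placed = {}            # job_id -> (machine index, load)
    peak = 0
    for event in events:
        if event[0] == "arrive":
            _, job, w = event
            m = min(range(n), key=loads.__getitem__)  # least loaded machine
            loads[m] += w
            placed[job] = (m, w)
            peak = max(peak, loads[m])
        else:
            m, w = placed.pop(event[1])
            loads[m] -= w
    return peak


with_departure = [("arrive", "a", 1), ("arrive", "b", 1),
                  ("depart", "a"), ("arrive", "c", 2)]
no_departure = [("arrive", "a", 1), ("arrive", "b", 1), ("arrive", "c", 2)]
```

On two machines, the second sequence drives greedy to a maximum load of 3, whereas placing both unit jobs on one machine and the job of load 2 on the other gives 2, a ratio of 3/2.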

Notably, the allowance of arbitrary assignment restriction makes the problem 
significantly harder. It is interesting to investigate the complexities of the settings 
where the assignment restriction is allowed in a more controllable manner. In 
particular, Bar-Noy et al. [Hj initiated the study of the following hierarchical 
model. The machines are related in the form of a tree. Each job specifies a 
machine M, so that the algorithm is restricted to choose a machine among the 
ancestors of M . Bar-Noy et al. showed that when the hierarchy is linear (i.e., the 
list model), we can achieve 0(1) competitive ratio. The complexity of the general 
tree model was left open. In this paper we show that the tree model actually 
admits an Ω(√n) lower bound. In other words, the tree model, though more
controllable, does not offer any advantage over arbitrary assignment restriction.

Intuitively, the list model orders the machines according to their capability, 
and a job specifies the least capable machine that can serve the job. A natural 
extension is that a job specifies both the least and the most capable machines (as 
in many applications, more capable machines would charge more). We call this 
model the interval model. The previous 0(l)-competitive algorithm fails to work 
here. We find that there is indeed an Ω(log n) lower bound, and we obtain an
O(log n)-competitive algorithm. On the other hand, if a job is allowed to request
two or more intervals, we show that every algorithm is Ω(√n)-competitive.

This paper also initiates the study of cluster-based assignment restriction, 
which is a practical extension of machine-based assignment restriction. A cluster 
is a collection of machines with same functionality. More formally, the cluster 
model states that each machine belongs to one of k clusters, and each job requests 
some clusters in which any machine can be used to serve the job. Similar to 
machine-based models, we also study clusters related in the form of lists, intervals 
and trees. The machine-based algorithms can be easily adapted to those settings, 
giving O(1), O(log n) and O(√n) upper bounds respectively. However, it is more
desirable to derive algorithms with competitive ratios depending on k instead 
of n, the total number of machines, since in reality, k is much smaller than n. 
For the list model and the interval model, we observe that the competitive ratios
are Θ(1) and Θ(log k), respectively. For the tree model, we have a more general
algorithm which actually works for the case where a job can request any clusters
arbitrarily. Let s_min be the number of machines in the smallest cluster. Denote
by K = n/s_min the total normalized number of machines. The competitive ratio
of our algorithm is O(√K). Note that k ≤ K ≤ n. If the clusters are of roughly
the same size, then K is O(k). We conjecture that this result can be further
improved to O(√k) for general trees. To support this conjecture, we give an
O(√k)-competitive algorithm for the special case in which the clusters form a
tree consisting of two levels. Table 1 shows a summary of these results.

Related Work: Two other variants of the machine load balancing problem 
have also been studied extensively in the literature. They include models in which 
jobs never depart, and in which jobs can be reassigned [21[TD]|T2J[SI[IIIIS1II3]- For 
details, readers can refer to the surveys of Azar jlj and Borodin and El-Yaniv m- 

1.2 On-Line Virtual Circuit Routing 

The virtual circuit routing problem is a generalization of the machine load ba- 
lancing problem to the context of high speed networks mia. The virtual circuit 
routing problem is defined as follows: We are given a directed graph with m 
edges. Every edge $e$ is associated with a capacity $c_e$. Again, jobs can arrive
and depart in an unpredictable fashion. Each job requests a route of a certain 
weight w from a source to a destination. When a job arrives, an on-line algo- 
rithm assigns the job to a path connecting the source to the destination, thereby 
increasing the load of every edge $e$ along that path by $w/c_e$ until the job departs.
The objective is to minimize the maximum load generated on any single edge 
over all time. The performance is again measured in terms of competitive ratio. 

It is widely known that the lower bound on the competitive ratio of machine
load balancing with assignment restriction can lead to a lower bound on the
competitive ratio of the virtual circuit routing problem I21IE]-
This lower bound holds even when all edges have the same capacity. An inte- 
resting open problem in the literature is to determine the competitive ratio of the 
virtual circuit routing problem (see e.g., HD- Prior to our work, the only related 
result is the work of Awerbuch et al. [3], who showed that if limited re-routing
is allowed (i.e., the path to which a job is assigned can change dynamically), an
$O(\log n)$-competitive algorithm exists.

In this paper we study the original virtual circuit routing problem and present
an algorithm which is $O(m^{2/3})$-competitive when all edges have the same
capacity. In addition, we improve the lower bound to $\Omega(\sqrt{m})$. For networks
with general edge capacities, the competitive ratio of our algorithm becomes
$O(W^{2/3})$, where $W$ is the total edge capacity normalized to the minimum edge
capacity (i.e., $W = \sum_{e} c_e/c_{\min}$, where $c_{\min}$ is the minimum edge capacity).

2 Machine-Based Assignment Restriction 

In this section we study the competitiveness in different settings of assignment 
restriction. In the tree model, machines are nodes of a tree. Each job specifies a 
machine, and the on-line assignment algorithm can assign the job to any ancestor 
of the specified machine in the tree. The list model is a special case where
machines are nodes of a list. Each job specifies a machine, and the algorithm can
assign the job to any machine in the list between the list head and the requested
machine. A 5-competitive algorithm for the list model has been known [U]. For
the tree model, the best known algorithm is the $O(\sqrt{n})$-competitive algorithm
inherited from arbitrary assignment restriction [7].

Fig. 1. Properly aligned sublists when n = 16.

We define the interval model as an extension of the list model. Each job
specifies two machines, and the algorithm may choose any machine in the list
between these two machines to serve the job. In this section we show that this
extension raises the competitive ratio to $\Theta(\log n)$. The lower bound result holds
even if we add the assumption that the jobs never depart and all jobs generate
the same load. We also show that every algorithm in the tree model is
$\Omega(\sqrt{n})$-competitive. A similar result is obtained when we further extend the
interval model to allow two intervals per request.



2.1 The Interval Model 

We first show an $O(\log n)$-competitive algorithm (called Interval) for the
interval model. Then we state an $\Omega(\log n)$ lower bound on the competitive ratio,
showing that Interval is optimal. In this extended abstract we omit the proof
of the latter. To ease our discussion, we assume $n$ is a power of two.

For the list model, a 5-competitive algorithm was shown in |U]. This algorithm
will be referred to as Linear. Our algorithm Interval identifies $2n - 2$ special
sublists, and runs a copy of Linear on each of them. These sublists are called
properly aligned sublists. For each $i$ between 0 and $\log n - 1$, we partition the
list of machines evenly into $2^i$ intervals named $I(i,0)$ to $I(i,2^i - 1)$. Each of
these intervals $I(i,j)$ is further subdivided into two properly aligned sublists:
the left one $L(i,j)$ is in reverse order, and the right one $R(i,j)$ is in the
original order.

Figure 1 shows an example. Note that each machine is contained in exactly
$\log n$ properly aligned sublists.
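The construction can be checked with a small sketch. It assumes machines are numbered 0 to n - 1 (the paper does not fix a numbering), and the function name `aligned_sublists` is illustrative only.

```python
def aligned_sublists(n):
    """Map (side, i, j) to the machines of L(i, j) or R(i, j), head first."""
    assert n >= 2 and n & (n - 1) == 0, "n must be a power of two"
    levels = n.bit_length() - 1            # log2(n)
    sublists = {}
    for i in range(levels):                # i = 0 .. log n - 1
        size = n >> i                      # machines per interval I(i, j)
        for j in range(1 << i):            # j = 0 .. 2^i - 1
            start = j * size
            mid = start + size // 2
            # Left half, listed in reverse so its head sits next to the midpoint.
            sublists[("L", i, j)] = list(range(mid - 1, start - 1, -1))
            # Right half, listed in the original order.
            sublists[("R", i, j)] = list(range(mid, start + size))
    return sublists
```

For n = 16 this yields 2n - 2 = 30 sublists, and every machine appears in exactly one sublist per level, hence in log n = 4 of them.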

Suppose a job arrives, requesting an interval $[l, r]$. Interval finds the interval
$I(i,j)$ with the largest $i$ such that $I(i,j)$ contains $[l,r]$. Since $I(0,0)$
contains all machines, such a pair always exists. Let $\hat{l}$ and $\hat{r}$ be the number of
machines in $L(i,j)$ and $R(i,j)$ respectively which are also in $[l,r]$. Note that, due
to the maximum $i$ requirement, the head of $L(i,j)$ (respectively $R(i,j)$) is always
included in $[l,r]$ whenever $\hat{l} > 0$ (respectively $\hat{r} > 0$). Interval dispatches the
job to the copy of Linear running on $L(i,j)$ if $\hat{l} > \hat{r}$, and to the copy of Linear
running on $R(i,j)$ otherwise. Interval then assigns the job to the machine
returned by the appropriate copy of Linear.
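The dispatch rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: machines are assumed to be numbered 0 to n - 1, the name `choose_sublist` is hypothetical, and the copy of Linear that finally picks the machine is not modelled. The last returned value is the number of machines, counted from the head of the chosen sublist, that the job may use, which is exactly a list-model request.

```python
def choose_sublist(n, l, r):
    """Return (side, i, j, prefix) for a job requesting machines l..r."""
    assert n >= 2 and n & (n - 1) == 0, "n must be a power of two"
    assert 0 <= l <= r < n
    levels = n.bit_length() - 1
    # Scan from the finest level; level 0 always contains [l, r].
    for i in range(levels - 1, -1, -1):
        size = n >> i
        j = l // size
        if r // size == j:                 # I(i, j) contains the whole request
            break
    start = j * size
    mid = start + size // 2                # first machine of R(i, j)
    left = max(0, min(r, mid - 1) - l + 1)     # machines of [l, r] in L(i, j)
    right = max(0, r - max(l, mid) + 1)        # machines of [l, r] in R(i, j)
    if left > right:
        return ("L", i, j, left)
    return ("R", i, j, right)
```

For example, with n = 16 the request [5, 9] is first contained in I(0, 0); it covers three machines of L(0, 0) and two of R(0, 0), so the job goes to the copy of Linear on L(0, 0).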
Theorem 1. Interval is $(10 \log n)$-competitive.

Proof. Assume to the contrary that there is a sequence of jobs such that the
optimal algorithm O generates a maximum load of OPT, and Interval generates
a maximum load of more than $10 \log n$ OPT. Consider the time when the maximum
load of Interval occurs in machine M. Since there are only $\log n$ properly
aligned sublists containing M, in at least one of them M has a load of more than
10 OPT. Without loss of generality, suppose this sublist is $L(i_0, j_0)$.

Consider all jobs dispatched to the sublist $L(i_0, j_0)$. As scheduled by Linear,
M receives a load of more than 10 OPT. We show in the following paragraphs
that it is possible to construct an off-line assignment A for $L(i_0, j_0)$ so that each
machine receives a load of at most 2 OPT at any time. Therefore, Linear is not
5-competitive, a contradiction.

For each job, we find the machine $\sigma$ to which O assigned the job. Note that
$\sigma$ is a machine in either $L(i_0, j_0)$ or $R(i_0, j_0)$, since they contain the requested
interval. If $\sigma$ is in $L(i_0, j_0)$, A also assigns the job to $\sigma$. Otherwise, if $\sigma$ is the
$r$-th machine in $R(i_0, j_0)$, A assigns the job to the $r$-th machine in $L(i_0, j_0)$. This
is always allowed: for Interval to dispatch the job to $L(i_0, j_0)$, the number of
machines permissible in this sublist must be more than that in $R(i_0, j_0)$.

On the other hand, the jobs assigned to each machine by A are simply the
union of jobs assigned to two machines by O, each having a load of at most OPT.
A thus gives rise to a load of at most 2 OPT at any time. □

Theorem 2. No on-line algorithm for the interval model has competitive ratio
less than $\log n/2$. This holds even if jobs never depart, and even if all jobs
generate the same load.

2.2 The Tree Model 

We show that the $O(\sqrt{n})$-competitive algorithm Robin-Hood introduced in [7]
is asymptotically optimal for the tree model. Note that if jobs never depart, $O(1)$
competitive ratio can be achieved [^. 

Theorem 3. No on-line algorithm for the tree model is $(\sqrt{n} - 1)$-competitive.
This is true even if all jobs generate the same load.

Proof. The proof is adapted from the lower bound proof for arbitrary assignment
restriction presented in US]. Consider the following tree of $r^2 + r$ nodes, with
$r$ non-leaf nodes forming a list, and $r^2$ leaf nodes being children of the tail of
this list. The following job sequence ensures that any on-line algorithm assigns
to one of the nodes at least $r$ jobs. The job sequence consists of $r^2$ phases. In
the $p$-th phase, $r$ jobs are released requesting an ancestor of the $p$-th leaf. If the
on-line algorithm assigns all these jobs to the leaf, we are done. Otherwise, we
retain a job assigned to a non-leaf node, and let all other jobs depart. After $r^2$
phases, the non-leaf nodes must be serving at least $r^2$ jobs, so one of them must
be serving at least $r$ jobs.

In an off-line assignment, only non-departing jobs are assigned to leaf nodes.
The maximum load created is 1. So the on-line algorithm is no better than
$r$-competitive. Since $r > \sqrt{n} - 1$, the theorem follows. □
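The adversary in this proof can be simulated directly. The sketch below is illustrative only: nodes 0 to r - 1 stand for the non-leaf list, nodes r to r^2 + r - 1 for the leaves, every job has unit load, and `assign` is a hypothetical callback standing in for an arbitrary on-line algorithm.

```python
def tree_adversary(r, assign):
    """Run the r^2-phase job sequence and return the peak on-line load."""
    non_leaves = list(range(r))
    loads = [0] * (r + r * r)
    peak = 0
    for p in range(r * r):                 # one phase per leaf
        leaf = r + p
        allowed = non_leaves + [leaf]      # the leaf and its ancestors
        placed = []
        for _ in range(r):                 # r unit-load jobs per phase
            node = assign(loads, allowed)
            if node not in allowed:
                raise ValueError("assignment violates the restriction")
            loads[node] += 1
            placed.append(node)
        peak = max(peak, max(loads))
        if all(node == leaf for node in placed):
            return peak                    # the leaf alone carries r jobs
        # Retain one job that sits on a non-leaf node; all others depart.
        keep = next(node for node in placed if node != leaf)
        for node in placed:
            loads[node] -= 1
        loads[keep] += 1
    return peak


def least_loaded(loads, allowed):
    """Example on-line policy: pick a permitted node with minimum load."""
    return min(allowed, key=lambda node: loads[node])
```

Whatever policy is plugged in, the returned peak is at least r, while an off-line assignment that sends each retained job to its leaf never exceeds load 1.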
In the above argument, if we number the non-leaf nodes as 1 to $r$, and the
leaf nodes as $r + 1$ to $r^2 + r$, all the jobs have assignment restriction in the form
$\{1, 2, \ldots, r, j\}$. Therefore, if we extend the interval model so that each job can
specify two intervals, the same lower bound holds.

Corollary 4. No on-line algorithm for the two-interval model is $(\sqrt{n} - 1)$-competitive.
This is true even if all jobs generate the same load.



3 Cluster-Based Assignment Restriction 

In reality, assignment restriction is usually used to model the requirement of jobs 
for some discrete capabilities possessed only by some machines. The number 
of the distinct sets of capabilities possessed by the machines is usually much 
smaller than the number of machines. This motivates us to study cluster-based 
assignment restriction models, an extension in which each machine belongs to 
one of k clusters. Each job specifies some clusters, so that only machines in 
these clusters can serve the job. We define the list, interval, 2-interval, tree and 
arbitrary restriction models analogous to the machine-based models in Sect. 2.

Here are some simple observations. In the extreme case in which each cluster 
contains only one machine, the cluster-based models reduce to the machine-based 
models. As a result, all the lower bounds in Sect. 2 still apply, replacing $n$ by $k$ in
the respective bounds. As mentioned earlier, k may be much smaller than n and 
thus it is more interesting to see whether we can provide better upper bounds 
in terms of k. 

For the list model, the algorithm of Bar-Noy et al. [S] is $O(1)$-competitive,
independent of $n$ and $k$. For the interval model, the algorithm in Sect. 2.1 can
be extended to produce an $O(\log k)$-competitive algorithm, matching the lower
bound. For the tree model and the arbitrary restriction model, a trivial algorithm
which always assigns a job to a machine in the largest specified cluster is
$(k + 1)$-competitive. However, it is not clear how we can provide better upper bounds.

In this section we show that the algorithm of Azar et al. [7] can be generalized
to the cluster-based arbitrary assignment restriction model, producing
an $O(\sqrt{K})$-competitive algorithm, where $K = n/s_{\min}$ and $s_{\min}$ is the size of
the smallest cluster. Thus in the case when all clusters are of similar sizes, it is
$O(\sqrt{k})$-competitive. We also show an $O(\sqrt{k})$-competitive algorithm that works
in the special case where the clusters are organized as a two-level tree.



3.1 Arbitrary Assignment Restriction Model 

In this section we study the cluster-based arbitrary assignment restriction model 
in which each machine belongs to one of k clusters and each job can request an 
arbitrary subset of clusters. Denote by $s_{\min}$ the size of the smallest cluster and let
$K = n/s_{\min}$. The lower bound on competitive ratio given in [5] can be expressed
as We extend the algorithm Robin-Hood introduced by Azar et al. |2] 

to work in this model, resulting in an $O(\sqrt{K})$-competitive algorithm Cluster.

To simplify our discussion, we assume that Cluster knows a value OPT 
specifying the maximum load generated by the optimal off-line algorithm. With 
the doubling technique [2], Cluster can be converted into an algorithm that
does not know OPT in advance (instead, it is approximated dynamically). This 
conversion incurs only a degradation factor of 4 to the competitive ratio. 

At any time, a cluster is said to be overloaded if its average load is greater 
than $\sqrt{K}$ OPT under Cluster. For any overloaded cluster, define its windfall
time to be the last moment it became overloaded. 

When a job arrives, Cluster chooses a cluster for the job as follows. If
possible, assign the job to a cluster that is not overloaded. Otherwise, assign it 
to the cluster with the greatest windfall time. Whenever a job is assigned to a 
cluster, the machine with the lowest load in that cluster is chosen. 
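The assignment rule can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' code: the class name `ClusterBalancer`, the integer clock used for windfall times, and the choice of the first non-overloaded requested cluster are all illustrative (the text leaves that choice open), and OPT is assumed to be known.

```python
import math


class ClusterBalancer:
    def __init__(self, cluster_sizes, opt):
        n = sum(cluster_sizes)
        K = n / min(cluster_sizes)              # normalized number of machines
        self.threshold = math.sqrt(K) * opt     # overload threshold sqrt(K) * OPT
        self.loads = [[0.0] * s for s in cluster_sizes]
        self.windfall = [None] * len(cluster_sizes)   # None means not overloaded
        self.clock = 0

    def _refresh(self, c):
        average = sum(self.loads[c]) / len(self.loads[c])
        if average <= self.threshold:
            self.windfall[c] = None
        elif self.windfall[c] is None:
            self.windfall[c] = self.clock       # moment the cluster became overloaded

    def assign(self, requested, weight):
        self.clock += 1
        calm = [c for c in requested if self.windfall[c] is None]
        # Prefer a cluster that is not overloaded; else the latest windfall time.
        c = calm[0] if calm else max(requested, key=lambda x: self.windfall[x])
        m = min(range(len(self.loads[c])), key=lambda x: self.loads[c][x])
        self.loads[c][m] += weight
        self._refresh(c)
        return c, m

    def depart(self, c, m, weight):
        self.clock += 1
        self.loads[c][m] -= weight
        self._refresh(c)
```

A departure can bring a cluster back below the threshold, which resets its windfall time; the next time it becomes overloaded it receives a fresh, later windfall time.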

Theorem 5. Given an accurate OPT, the load of any machine under Cluster
at any time is at most $(2\sqrt{K} + 2)$ OPT. Thus, Cluster is $(2\sqrt{K} + 2)$-competitive.

Proof. The proof is similar to that of Robin-Hood. The details are omitted.

3.2 Two-Level Trees

In this section we present a simple $O(\sqrt{k})$-competitive algorithm for the two-level
cluster model, in which the clusters form a tree consisting of two levels. More
precisely, the machines are partitioned into $k - 1$ leaf clusters $S_i$ ($1 \le i \le k - 1$)
and a root cluster $S_0$. Each job specifies one of $S_i$ containing machines which can
be used to serve the job, but the algorithm may instead use a machine in $S_0$. The
result in Sect. 2.2 can be adapted to this model, giving an $\Omega(\sqrt{k})$ lower bound
on the competitive ratio. Here, we present an algorithm called TwoLevel with
a matching upper bound.

We assume that TwoLevel knows OPT, the maximum load generated by
the optimal off-line algorithm. As in Sect. 3.1, this assumption can easily be
removed. If a job arrives which requests a leaf cluster containing a machine with
less than $\sqrt{k}$ OPT load, TwoLevel assigns the job to that machine. Otherwise,
TwoLevel assigns it to a machine in $S_0$ with the lowest current load.
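A minimal sketch of this rule follows. The function name `two_level_assign` and its list-based interface are assumptions made for illustration; among the leaf machines below the threshold, the sketch picks the least loaded one, which is one valid choice since the text allows any such machine.

```python
import math


def two_level_assign(leaf_loads, root_loads, weight, opt, k):
    """Place one job; mutates the load lists and returns (cluster, index)."""
    threshold = math.sqrt(k) * opt
    m = min(range(len(leaf_loads)), key=lambda x: leaf_loads[x])
    if leaf_loads[m] < threshold:          # some leaf machine is below sqrt(k) * OPT
        leaf_loads[m] += weight
        return ("leaf", m)
    # Every machine of the requested leaf cluster is at or above the threshold.
    m = min(range(len(root_loads)), key=lambda x: root_loads[x])
    root_loads[m] += weight
    return ("root", m)
```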

Theorem 6. Given an accurate OPT, TwoLevel is $(\sqrt{k} + 2)$-competitive.

Proof. Since no job creates a load of more than OPT, each machine in a leaf
cluster has a load of at most $(\sqrt{k} + 1)$ OPT. The remainder of the proof establishes
that no machine of the root cluster gets a load of more than $(\sqrt{k} + 2)$ OPT.

Let $s_i$ denote the number of machines in $S_i$. Consider any particular time $t$.
Denote by $n_i(t)$ and $f_i(t)$ the total load of machines in $S_0$ at $t$ due to jobs
requesting $S_i$ under TwoLevel and the optimal off-line algorithm respectively.

Note that, at the last time when $n_i(t)$ increases, all machines in $S_i$ must
have load at least $\sqrt{k}$ OPT. The off-line algorithm must accommodate the sum
of load in these machines, i.e.,

$$ s_i \sqrt{k}\,\mathrm{OPT} + n_i(t) \le (s_0 + s_i)\,\mathrm{OPT} \quad \text{for } 1 \le i \le k - 1. \tag{1} $$

On the other hand, the off-line algorithm assigns a load of at least $n_i(t)$ to
machines in $S_0$ or $S_i$.

$$ n_i(t) \le s_i\,\mathrm{OPT} + f_i(t) \quad \text{for } 1 \le i \le k - 1. \tag{2} $$

Eliminating $s_i$, summing over $i$ and adding the equality $n_0(t) = f_0(t)$, we have

$$ \sqrt{k} \sum_{i=0}^{k-1} n_i(t) \le s_0 (k - 1)\,\mathrm{OPT} + (\sqrt{k} - 1) \sum_{i=0}^{k-1} f_i(t). \tag{3} $$

Note that $\sum_{i=0}^{k-1} f_i(t) \le s_0\,\mathrm{OPT}$. As a result,
$\sqrt{k} \sum_{i=0}^{k-1} n_i(t) \le (k + \sqrt{k} - 2)\, s_0\,\mathrm{OPT}$; thus,
$\sum_{i=0}^{k-1} n_i(t) \le (\sqrt{k} + 1)\, s_0\,\mathrm{OPT}$. Since TwoLevel always
assigns a job to a machine with the lowest load when using $S_0$, the load at $t$ is at
most $(\sqrt{k} + 2)$ OPT for any machine in $S_0$. □
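The elimination of $s_i$ used in the proof can be spelled out for a single leaf cluster. The following is a reader's sketch of that one step, assuming inequality (1) reads $s_i \sqrt{k}\,\mathrm{OPT} + n_i(t) \le (s_0 + s_i)\,\mathrm{OPT}$ and inequality (2) reads $n_i(t) \le s_i\,\mathrm{OPT} + f_i(t)$; it is not taken from the paper.

```latex
\begin{align*}
(\sqrt{k} - 1)\, s_i\,\mathrm{OPT} &\le s_0\,\mathrm{OPT} - n_i(t)
  && \text{rearranging (1)} \\
(\sqrt{k} - 1)\, n_i(t) &\le (\sqrt{k} - 1)\, s_i\,\mathrm{OPT} + (\sqrt{k} - 1)\, f_i(t)
  && \text{scaling (2) by } \sqrt{k} - 1 \ge 0 \\
\sqrt{k}\, n_i(t) &\le s_0\,\mathrm{OPT} + (\sqrt{k} - 1)\, f_i(t)
  && \text{combining, for } 1 \le i \le k - 1
\end{align*}
```

Summing the last line over the $k - 1$ leaf clusters gives the $s_0 (k - 1)\,\mathrm{OPT}$ term. The final step of the proof then divides by $\sqrt{k}$ and uses $(k + \sqrt{k} - 2)/\sqrt{k} = \sqrt{k} + 1 - 2/\sqrt{k} \le \sqrt{k} + 1$.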

4 Virtual Circuit Routing 

In this section we study the virtual circuit routing problem. We are given a 
directed graph with edge set $E$ of $m$ edges. Each edge $e$ is associated with a
capacity $c_e$. Jobs arrive at unpredictable times, each requesting a route from a
source to a destination. We need to find a path from the source to the destination,
thereby increasing the load of each edge $e$ in the path by $w/c_e$, where $w$ is the
weight specified by the job. Our objective is to minimize, over all time, the 
maximum load on any edge. 
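The load accounting in this definition can be illustrated with a small sketch. The class name `EdgeLoads` and its interface are hypothetical; edges are any hashable identifiers and a path is a list of edges.

```python
class EdgeLoads:
    def __init__(self, capacity):
        self.capacity = dict(capacity)          # edge -> capacity c_e
        self.load = {e: 0.0 for e in capacity}  # edge -> current load
        self.peak = 0.0                         # maximum load seen over all time

    def route(self, path, w):
        """A job of weight w is assigned to the given path."""
        for e in path:
            self.load[e] += w / self.capacity[e]
        self.peak = max(self.peak, max(self.load.values()))

    def depart(self, path, w):
        """The job previously routed on this path leaves."""
        for e in path:
            self.load[e] -= w / self.capacity[e]
```

The quantity an on-line algorithm tries to keep small is `peak`, compared against the same quantity for the optimal off-line routing.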

In 1^, Aspnes et al. studied the problem with the assumption that jobs 
never depart. With this assumption, they gave an $O(\log m)$-competitive on-line
algorithm and proved that no on-line algorithm can have competitive ratio better
than $\Omega(\log m)$. We present an on-line algorithm called VC-Routing for the
more general case where jobs may depart at unpredictable times. It is
$O(W^{2/3})$-competitive, where $W = \sum_{e \in E} c_e/c_{\min}$ and $c_{\min}$ is the minimum capacity of the
edges. When all edges have identical capacity, the competitive ratio is equivalent
to $O(m^{2/3})$. We also prove that any on-line algorithm for the problem is
$\Omega(\sqrt{m})$-competitive, even if all the edges have identical capacity.

4.1 The Algorithm 

VC-Routing is a novel adaptation of Robin-Hood, the algorithm achieving 
optimal competitive ratio for the machine load balancing problem with arbitrary 
assignment restriction. The main challenge in the virtual circuit routing problem 
is that the length of the path chosen by the off-line algorithm may be different 
from that of the path chosen by the on-line algorithm. As a result, the aggregate 
load generated by the on-line algorithm can be much more than that generated by 
the optimal off-line algorithm. In order to control the difference of the aggregate 
load, VC-Routing takes into account the length of paths when assigning a job. 
Roughly speaking, it prefers relatively short paths, and applies a strategy similar 
to Robin-Hood only to those short paths. The details are as follows. 

We assume that VC-Routing knows a value OPT specifying the maximum 
load generated by the optimal off-line algorithm. As in Sect. 3.1, such an
algorithm can be converted to one which does not need the value of OPT.

At any particular time, we say an edge is overloaded if its current load is 
greater than OPT. An overloaded path is a path which contains an overloaded
edge. For an overloaded edge, we define its windfall time as the last
moment it became overloaded. The windfall time of an overloaded path is the
minimum windfall time of the overloaded edges on the path. We say a path is 
short if its length is no more than it is medium if its length is in the range 

otherwise, it is long. For every job $j$ with weight $w(j)$, we say a
path is eligible if every edge $e$ on it satisfies $w(j)/c_e \le \mathrm{OPT}$.

When a new job arrives, VC-Routing selects a path among the eligible 
paths in the order of short, medium and finally long paths. Ties among short 
paths are broken as follows. VC-Routing selects one that is not overloaded if 
it exists; otherwise, it selects one with maximum windfall time. For medium and 
long paths, ties are broken arbitrarily. 
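The selection rule can be sketched as below. This is an illustration under explicit assumptions: the length bounds `short_max` and `medium_max` and the overload threshold `overload_at` are hypothetical parameters supplied by the caller, candidate paths are given as lists of edges, and `windfall` maps each currently overloaded edge to the time it last became overloaded.

```python
def select_path(paths, load, windfall, cap, weight, opt,
                short_max, medium_max, overload_at):
    """Pick a path for a job of the given weight, or None if none is eligible."""
    def rank(p):                       # 0 = short, 1 = medium, 2 = long
        return 0 if len(p) <= short_max else 1 if len(p) <= medium_max else 2

    def overloaded_edges(p):
        return [e for e in p if load[e] > overload_at]

    eligible = [p for p in paths if all(weight / cap[e] <= opt for e in p)]
    if not eligible:
        return None
    best = min(rank(p) for p in eligible)
    candidates = [p for p in eligible if rank(p) == best]
    if best > 0:
        return candidates[0]           # medium or long: ties broken arbitrarily
    for p in candidates:
        if not overloaded_edges(p):
            return p                   # a short path that is not overloaded
    # All short paths are overloaded: take the maximum path windfall time,
    # where a path's windfall time is the minimum over its overloaded edges.
    return max(candidates,
               key=lambda p: min(windfall[e] for e in overloaded_edges(p)))
```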

We analyse the competitive ratio of VC-Routing by bounding the load 
on any edge. For edges which are not overloaded, this is trivially bounded by 
W’^I^OPT . To bound the load on an overloaded edge $e_0$, we partition the jobs
assigned to $e_0$ into three sets, depending on the arrival time of the jobs and on
how the off-line algorithm assigns the jobs. Let $J_1(e_0)$ be the partition containing
jobs which are assigned to $e_0$ at or before the windfall time of $e_0$, let $J_2(e_0)$ be
the partition containing those remaining jobs which the off-line algorithm assigns
to either a medium or a long path, and let $J_3(e_0)$ be the partition containing the
other jobs. We show the competitive ratio of VC-Routing using the following
lemma. The proofs will appear in the full version of this paper. 

Lemma 7. The total weight of jobs in $J_2(e_0)$ and $J_3(e_0)$ are both at most
$W^{2/3}\,\mathrm{OPT}\,c_{e_0}$.

Theorem 8. Given an accurate OPT, the load of any edge under VC-Routing 
at any time is at most -I- l)OPT. Thus VC-Routing is -I- 1)- 

competitive. 

4.2 Lower Bound 

We show an $\Omega(\sqrt{m})$ lower bound on the competitive ratio of any on-line algorithm
for the virtual circuit routing problem. Consider the graph in Fig. 2, which has
$4r$ nodes and $3r^2 + r$ edges with identical capacity. The set of source nodes and the
set of destination nodes form a complete bipartite graph. Each of these sets forms a
bipartite graph with a set of $r$ intermediate nodes. The two sets of intermediate nodes
form a bipartite matching. Let $E'$ be the set of edges connecting the intermediate
nodes.

Fig. 2. An $\Omega(\sqrt{m})$ lower bound.

We construct a sequence of jobs with $r^2$ phases. In each phase, $r$ jobs of unit
weight are released with a distinct pair of source and destination. If the on-line
algorithm assigns all these jobs to the edge connecting the source and the
destination, we are done. Otherwise, at least one of these jobs is assigned to a path
containing an edge in $E'$. At the end of this phase, all jobs except this one depart.
After $r^2$ phases, $r^2$ jobs remain, each of which
increases the load of one edge in $E'$. Since there are $r$ edges in $E'$, at least one of
them has a load of $r$. On the other hand, in an off-line assignment, the job that
never departs is assigned to the edge connecting the source and the destination
involved. The maximum load is 1 at all times. Therefore, we have the following
theorem.

Theorem 9. No on-line algorithm for the virtual circuit routing problem is
$(\sqrt{m/5} - 1)$-competitive. This is true even if all edges have identical capacity.
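The graph used in the lower bound can be built explicitly to check its size. In this sketch the node labels and the function name `lower_bound_graph` are illustrative, and each of the source and destination sets is assumed to be fully connected to its set of intermediate nodes, which is what the stated edge count of 3r^2 + r implies.

```python
def lower_bound_graph(r):
    """Return (nodes, edges, e_prime) for the lower-bound construction."""
    sources = [("s", i) for i in range(r)]
    dests = [("d", i) for i in range(r)]
    mid_in = [("a", i) for i in range(r)]      # intermediates next to the sources
    mid_out = [("b", i) for i in range(r)]     # intermediates next to the destinations
    direct = [(s, d) for s in sources for d in dests]
    e_prime = list(zip(mid_in, mid_out))       # the matching E'
    detour = ([(s, a) for s in sources for a in mid_in]
              + [(b, d) for b in mid_out for d in dests])
    nodes = sources + dests + mid_in + mid_out
    return nodes, direct + detour + e_prime, e_prime
```

Every route from a source to a destination other than the direct edge must cross one of the r edges of E', which is what forces the retained jobs to pile up there.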
Abstract. We consider online routing strategies for routing between
the vertices of embedded planar straight line graphs. Our results include
(1) two deterministic memoryless routing strategies, one that works for
all Delaunay triangulations and the other that works for all regular
triangulations, (2) a randomized memoryless strategy that works for all
triangulations, (3) an $O(1)$ memory strategy that works for all convex
subdivisions, (4) an $O(1)$ memory strategy that approximates the shortest
path in Delaunay triangulations, and (5) theoretical and experimental
results on the competitiveness of these strategies.



1 Introduction 

In this paper we consider online routing in the following abstract setting: The
environment is a planar straight line graph [12], $T$, with $n$ vertices, whose edges
are weighted by the Euclidean distance between their endpoints, the source $v_{src}$
and destination $v_{dst}$ are vertices of $T$, and a packet can only move on edges of
$T$. Initially, a packet only knows $v_{src}$, $v_{dst}$, and $N(v_{src})$, where $N(v)$ denotes the
set of vertices adjacent to $v$.

We classify online routing strategies based on their use of memory and/or
randomization. Define $v_{cur}$ as the vertex at which the packet is currently stored.
A routing strategy is called memoryless if the next step taken by a packet
depends only on $v_{cur}$, $v_{dst}$, and $N(v_{cur})$. A strategy is randomized if the next step
taken by a packet is chosen randomly from $N(v_{cur})$. A randomized strategy is
memoryless if the distribution used to choose from $N(v_{cur})$ is a function only of
$v_{cur}$, $v_{dst}$, and $N(v_{cur})$.

For a strategy $S$ we say that a graph defeats $S$ if there is a source/destination
pair such that a packet never reaches the destination when beginning at the
source. If $S$ finds a path $P$ from $v_{src}$ to $v_{dst}$ we call $P$ the $S$ path from $v_{src}$ to
$v_{dst}$. Here we use the term path in an intuitive sense rather than a strict graph
theoretic sense, since $P$ may visit the same vertex more than once.

In this paper we also consider, as a special case, a class of “well-behaved”
triangulations. The Voronoi diagram [11] of a point set $S$ is a partitioning of space into cells
such that all points within a Voronoi cell are closer to the same element $p \in S$
than any other point in $S$. The Delaunay triangulation is the straight-line face
dual of the Voronoi diagram, i.e., two points in $S$ have an edge between them in
the Delaunay triangulation if their Voronoi regions have an edge in common.

In this paper we consider several different routing strategies and compare
their performance empirically. In particular, we describe (1) a memoryless
strategy that is not defeated by any Delaunay triangulation, (2) a memoryless
strategy that is not defeated by any regular triangulation, (3) a memoryless
randomized strategy that uses 1 random bit per step and is not defeated by
any triangulation, (4) a strategy that uses $O(1)$ memory that is not defeated
by any convex subdivision, (5) a strategy for Delaunay triangulations that uses
$O(1)$ memory in which a packet never travels more than a constant times the
Euclidean distance between $v_{src}$ and $v_{dst}$, and (6) a theoretical and empirical
study of the quality (length) of the paths found by these strategies.

The first four routing strategies are described in Section 2. Section 3 presents 
theoretical and empirical results on the length of the paths found by these
strategies and describes our strategy for Delaunay triangulations. A discussion
of related previous work is provided in Section 4. Finally, Section 5 summarizes 
and describes directions for future research. 

Due to space constraints, some proofs and figures are omitted from this ex- 
tended abstract. The interested reader is referred to the full version of the paper 
[3]. 

2 Four Simple Strategies 

In this section we describe four online routing strategies and prove theorems 
about which types of graphs never defeat them. We begin with the simplest 
(memoryless) strategies and proceed to the more complex strategies. 

2.1 Greedy Routing 

The greedy routing (GR) strategy always moves the packet to the neighbor
$gdy(v_{cur})$ of $v_{cur}$ that minimizes $dist(gdy(v_{cur}), v_{dst})$, where $dist(p,q)$
denotes the Euclidean distance between $p$ and $q$. In the case of ties, one of the
vertices is chosen arbitrarily. The greedy routing strategy can be defeated by a
triangulation $T$ in two ways: (1) the packet can get trapped moving back and forth
on an edge of the triangulation (Fig. 1 (a)), or (2) the packet can get trapped on a
cycle of three or more vertices (Fig. 1 (b)). However, as the following theorem
shows, neither of these situations can occur if $T$ is a Delaunay triangulation.

Fig. 1. Triangulations that defeat the greedy routing strategy.
Theorem 1 There is no point set whose Delaunay triangulation defeats the
greedy routing strategy. 
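A single greedy step and the resulting route can be sketched as follows. The data layout is an assumption made for illustration: `pos` maps each vertex to its coordinates and `neighbors` maps each vertex to a list of adjacent vertices. Because greedy routing can be defeated on general triangulations, the hypothetical `max_steps` cap makes the sketch return None instead of looping forever; ties go to the first neighbor in list order.

```python
import math


def greedy_route(v_src, v_dst, neighbors, pos, max_steps=None):
    """Follow greedy routing from v_src; return the path or None if it stalls."""
    limit = max_steps if max_steps is not None else 2 * len(pos)
    path = [v_src]
    while path[-1] != v_dst:
        if len(path) > limit:
            return None                    # treated as "defeated" in this sketch
        cur = path[-1]
        # Move to the neighbor closest to the destination in Euclidean distance.
        nxt = min(neighbors[cur], key=lambda u: math.dist(pos[u], pos[v_dst]))
        path.append(nxt)
    return path
```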



2.2 Compass Routing 



The compass routing (CR) strategy always moves the packet to the vertex
$cmp(v_{cur})$ that minimizes the angle $\angle v_{dst}, v_{cur}, cmp(v_{cur})$ over all vertices
adjacent to $v_{cur}$. Here the angle is taken to be the smaller of the two angles as
measured in the clockwise and counterclockwise directions. In the case of ties,
one of the (at most 2) vertices is chosen using some arbitrary deterministic rule.
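One compass step can be sketched as below, using the same assumed data layout as before (`pos` for coordinates, `neighbors` for adjacency lists). The helper name `angle_at` is illustrative, and the deterministic tie-breaking rule chosen here is simply list order.

```python
import math


def angle_at(origin, a, b):
    """Unsigned angle a-origin-b, the smaller of the two directions, in [0, pi]."""
    ta = math.atan2(a[1] - origin[1], a[0] - origin[0])
    tb = math.atan2(b[1] - origin[1], b[0] - origin[0])
    d = abs(ta - tb) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def compass_next(v_cur, v_dst, neighbors, pos):
    """Neighbor of v_cur whose direction is closest to the direction of v_dst."""
    return min(neighbors[v_cur],
               key=lambda u: angle_at(pos[v_cur], pos[v_dst], pos[u]))
```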

One might initially believe (as we did) that compass routing can always be used
to find a path between any two vertices in a triangulation. However, the
triangulation in Fig. 2 defeats compass routing. When starting from one of the
vertices on the outer face of $T$, and routing to $v_{dst}$, the compass routing strategy
gets trapped on the cycle shown in bold. The following lemma shows that any
triangulation that defeats compass routing causes the packet to get trapped in a
cycle.

Fig. 2. A triangulation that defeats the compass routing strategy.



Lemma 1 Let $T$ be a triangulation that defeats compass routing, and let $v_{dst}$ be
a vertex such that compass routing fails to route a packet to $v_{dst}$ when given
some other vertex as the source. Then there exists a cycle
$C = v_0, \ldots, v_{k-1}$ ($k \ge 3$) in $T$ such that
$cmp(v_i) = v_{i+1}$ for all $0 \le i < k$.¹



We call such a cycle, $C$, a trapping cycle in $T$ for $v_{dst}$. Next we characterize
trapping cycles in terms of a visibility property of triangulations. Let $t_1$ and
$t_2$ be two triangles in $T$. Then we say that $t_1$ obscures $t_2$ if there exists a ray
originating at $v_{dst}$ that strikes $t_1$ first and then $t_2$. Let $u$ and $v$ be any two
vertices of $T$ such that $cmp(u) = v$. Then define $\triangle uv$ as the triangle of $T$ that
is contained in the closed half-plane bounded by the line through $uv$ and that
contains $v_{dst}$. We obtain the following useful characterization of trapping cycles.



Lemma 2 Let $T$ be a triangulation that defeats compass routing and let $C =
v_0, \ldots, v_{k-1}$ be a trapping cycle in $T$ for vertex $v_{dst}$. Then $\triangle v_i v_{i+1}$ is either
identical to, or obscures, $\triangle v_{i-1} v_i$, for all $0 \le i < k$.

A regular triangulation [13] is a triangulation obtained by orthogonal projection
of the faces of the lower hull of a 3-dimensional polytope onto the plane.
Note that the Delaunay triangulation is a special case of a regular triangulation
in which the vertices of the polytope all lie on a paraboloid. Edelsbrunner [7]
showed that if $T$ is a regular triangulation, then $T$ has no set of triangles that
obscure each other cyclically from any viewpoint. This result, combined with
Lemma 2, yields our main result on compass routing.

¹ Here and henceforth, all subscripts are assumed to be taken mod $k$.

Theorem 2 There is no regular triangulation that defeats the compass routing 
strategy. 



2.3 Randomized Compass Routing 

In this section, we consider a randomized routing strategy that is not defeated by 
any triangulation. Let $cw(v)$ be the vertex in $N(v)$ that minimizes the clockwise
angle $\angle v_{dst}, v, cw(v)$ and let $ccw(v)$ be the vertex in $N(v)$ that minimizes the
counterclockwise angle $\angle v_{dst}, v, ccw(v)$. Then the randomized compass routing
(RCR) strategy moves the packet to one of $\{cw(v_{cur}), ccw(v_{cur})\}$ with equal
probability.
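The two candidates and the random choice can be sketched as follows, again assuming `pos` holds coordinates (with the y axis pointing up, so that decreasing polar angle is clockwise) and `neighbors` holds adjacency lists. The function names are illustrative, and `rng` is any object with a `random()` method.

```python
import math
import random


def cw_ccw(v, v_dst, neighbors, pos):
    """Return (cw(v), ccw(v)) with respect to the direction from v to v_dst."""
    def theta(u):
        return math.atan2(pos[u][1] - pos[v][1], pos[u][0] - pos[v][0])

    t = theta(v_dst)
    two_pi = 2 * math.pi
    cw = min(neighbors[v], key=lambda u: (t - theta(u)) % two_pi)
    ccw = min(neighbors[v], key=lambda u: (theta(u) - t) % two_pi)
    return cw, ccw


def rcr_next(v, v_dst, neighbors, pos, rng=random):
    """One step of randomized compass routing: one fair random bit per step."""
    cw, ccw = cw_ccw(v, v_dst, neighbors, pos)
    return cw if rng.random() < 0.5 else ccw
```

When the destination is adjacent to the current vertex, both angles are zero for it, so both candidates coincide with the destination and the step is deterministic.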

Before we can make statements about which triangulations defeat random- 
ized compass routing, we must define what it means for a triangulation to defeat 
a randomized strategy. We say that a triangulation T defeats a (randomized) 
routing strategy if there exists a pair of vertices $v_{src}$ and $v_{dst}$ of $T$ such that a
packet originating at $v_{src}$ with destination $v_{dst}$ has probability 0 of reaching $v_{dst}$
in any finite number of steps.

Note that, since randomized compass routing is memoryless, proving that 
a triangulation T does not defeat randomized compass routing implies that a 
packet reaches its destination with probability 1. The following theorem shows 
the versatility of randomized compass routing. 

Theorem 3 There is no triangulation that defeats the randomized compass rout- 
ing strategy. 

Proof. Assume, by way of contradiction, that a triangulation $T$ exists that defeats
the randomized compass routing strategy. Then there is a vertex $v_{dst}$
of $T$ and a minimal set $S$ of vertices such that:
(1) $v_{dst} \notin S$, (2) the subgraph $H$ of $T$ induced by
$S$ is connected, and (3) for every $v \in S$, $cw(v) \in S$
and $ccw(v) \in S$.

Refer to Fig. 3 for what follows. The vertex $v_{dst}$ lies in some face $F$ of $H$.
Let $v$ be a vertex on the boundary of $F$ such that the line segment $(v, v_{dst})$ is
contained in $F$. Such a vertex is guaranteed to exist [5]. The two neighbours of
$v$ on the boundary of $F$ must be $cw(v)$ and $ccw(v)$ and these cannot be the same
vertex (since $F$ contains $(v, v_{dst})$ in its interior). Note that, by the definition
of $cw(v)$ and $ccw(v)$, and by the fact that $T$ is a triangulation, the triangle
$(cw(v), v, ccw(v))$ is in $T$. But this is a contradiction, since then $v$ is not on
the boundary of $F$. □

Fig. 3. The proof of Theorem 3.
2.4 Right-Hand Routing

The folklore “right-hand rule” for exploring a maze states that if a player in 
a maze walks around never lifting her right-hand from the wall, then she will 
eventually visit every wall in the maze. More specifically, if the maze is the face 
of a connected planar straight line graph, the player will visit every edge and 
vertex of the face [2]. 

Let T be any convex subdivision. Consider the planar subdivision T' obtained
by deleting from T all edges that properly intersect the line segment joining
v_src and v_dst. Because of convexity, T' is connected, and v_src and v_dst
are on the boundary of the same face F of T'. The right-hand routing (RHR)
strategy uses the right-hand rule on the face F to route from v_src to v_dst.
Right-hand routing is easily implemented using only O(1) additional memory by
remembering v_src, v_dst, and the last vertex visited.
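
The construction of T' can be illustrated with a short sketch. The code below
is not taken from the paper; it assumes vertices are (x, y) tuples and edges
are pairs of vertices, and the names orient, properly_intersects and
prune_edges are hypothetical helpers introduced here for illustration only.

```python
def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a).

    Returns +1 for a left turn, -1 for a right turn, 0 if collinear.
    """
    val = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (val > 0) - (val < 0)


def properly_intersects(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both."""
    return (orient(p, q, a) * orient(p, q, b) < 0
            and orient(a, b, p) * orient(a, b, q) < 0)


def prune_edges(edges, v_src, v_dst):
    """Edges of T' (assumed representation): drop every edge of T that
    properly intersects the segment (v_src, v_dst)."""
    return [(a, b) for a, b in edges
            if not properly_intersects(v_src, v_dst, a, b)]
```

Edges that merely touch the segment at an endpoint are kept, since only
proper intersections are removed.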

Theorem 4 There is no convex subdivision that defeats the right-hand routing 
strategy. 



3 Competitiveness of Paths 

Thus far we have considered only the question of whether routing strategies
can find a path between any two vertices in T. An obvious direction for
research is to consider the length of the path found by a routing strategy.
We say that a routing strategy is c-competitive for T if for any pair
(v_src, v_dst) in T, the length (sum of the edge lengths) of the path between
v_src and v_dst found by the strategy is at most c times the length of the
shortest path between v_src and v_dst in T. In the case of randomized
strategies, we use the expected length of the path. A strategy has a
competitive ratio of c if it is c-competitive.
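
As a minimal sketch of this definition, the ratio achieved on a single
(v_src, v_dst) pair can be computed by comparing the route found by a strategy
with a shortest path obtained by Dijkstra's algorithm. The graph
representation (a dict mapping each vertex to a list of its neighbours) and
the function names are assumptions made here for illustration.

```python
import heapq
import math


def path_length(path):
    """Sum of Euclidean edge lengths along a list of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def shortest_path_length(adj, src, dst):
    """Dijkstra's algorithm with Euclidean edge weights."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for w in adj[u]:
            nd = d + math.dist(u, w)
            if nd < dist.get(w, math.inf):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return math.inf


def competitive_ratio(adj, route):
    """Length of the given route divided by the shortest path length
    between its endpoints."""
    return path_length(route) / shortest_path_length(adj, route[0], route[-1])
```

The competitive ratio of a strategy on T is then the maximum of this quantity
over all pairs of vertices.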

This section addresses questions about the competitive ratio of the strategies 
described so far, as well as a new strategy specifically targeted for Delaunay 
triangulations. We present theoretical as well as experimental results. 



3.1 Negative Results 

It is not difficult to contrive triangulations for which none of our strategies are 
c-competitive for any constant c. Thus it is natural to restrict our attention to 
a well behaved class of triangulations. Unfortunately, even for Delaunay trian- 
gulations none of the strategies described so far are c-competitive. 

Theorem 5 There exist Delaunay triangulations for which none of the greedy,
compass, randomized compass, or right-hand routing strategies is c-competitive
for any constant c.
3.2 A c-Competitive Strategy for Delaunay Triangulations

Since none of the strategies described in Section 2 is competitive, even for
Delaunay triangulations, an obvious question is whether there exists any
strategy that is competitive for Delaunay triangulations. In this section we
answer this question in the affirmative. Our strategy is based on the
remarkable proof of Dobkin et al. [6] that the Delaunay triangulation
approximates the complete Euclidean graph to within a constant factor in terms
of shortest path length. In the following we will use the notation x(p) (resp.
y(p)) to denote the x-coordinate (resp. y-coordinate) of the point p, and the
notation |X| to denote the Euclidean length of the path X.

Consider the directed line segment from v_src to v_dst. This segment
intersects regions of the Voronoi diagram in some order, say
R_0, ..., R_{m-1}, where R_0 is the Voronoi region of v_src and R_{m-1} is the
Voronoi region of v_dst. The Voronoi routing (VR) strategy for Delaunay
triangulations moves the packet from v_src to v_dst along the path
v_0, ..., v_{m-1}, where v_i is the site defining R_i. An example of a path
obtained by the Voronoi routing strategy is shown in Fig. 4. Since the Voronoi
region of a vertex v can be computed given only the neighbours of v in the
Delaunay triangulation, it follows that the Voronoi routing strategy is an
O(1) memory routing strategy.
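
The Voronoi path can be sketched as follows. This is an illustration under
stated assumptions rather than the routing strategy itself: it scans all sites
instead of only the Delaunay neighbours of the current vertex, it assumes the
sites are in general position, and the function name voronoi_path is
hypothetical. The point p(t) = v_src + t(v_dst - v_src) is closer to a site w
than to the current site c exactly when t * a > b, with a and b as computed in
the code, so the next region entered is that of the site with the smallest
crossing parameter b / a.

```python
def voronoi_path(sites, v_src, v_dst):
    """Sites whose Voronoi regions are crossed by the segment
    (v_src, v_dst), in order (brute-force sketch)."""
    dx, dy = v_dst[0] - v_src[0], v_dst[1] - v_src[1]
    path, cur, t = [v_src], v_src, 0.0
    while cur != v_dst:
        best_t, best = None, None
        for w in sites:
            if w == cur:
                continue
            wx, wy = w[0] - cur[0], w[1] - cur[1]
            a = 2 * (dx * wx + dy * wy)
            if a <= 0:
                continue  # w never overtakes cur further along the segment
            b = ((w[0] ** 2 + w[1] ** 2) - (cur[0] ** 2 + cur[1] ** 2)
                 - 2 * (v_src[0] * wx + v_src[1] * wy))
            tw = b / a  # parameter at which the bisector of cur and w is crossed
            if tw >= t and (best_t is None or tw < best_t):
                best_t, best = tw, w
        if best is None:
            raise ValueError("v_dst must be one of the sites")
        cur, t = best, best_t
        path.append(cur)
    return path
```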




Fig. 4. A path obtained by the Voronoi routing strategy. 

The Voronoi routing strategy on its own is not c-competitive for all Delaunay
triangulations. However, it does have some properties that allow us to derive
a c-competitive strategy. As with right-hand routing, let T' be the graph
obtained from T by removing all edges of T that properly intersect the segment
(v_src, v_dst), and let F be the face of T' that contains both v_src and
v_dst. Assume wlog that v_src and v_dst both lie on the x-axis and that
x(v_src) < x(v_dst). The following results follow from the work of Dobkin et
al. [6].

Lemma 3 The Voronoi path is x-monotone, i.e., x(v_i) < x(v_j) for all i < j.

Lemma 4 Let P' be the collection of maximal subpaths of v_0, ..., v_{m-1} that
remain above the x-axis, i.e., P' = {v_i, ..., v_j : y(v_{i-1}) < 0 and
y(v_{j+1}) < 0 and y(v_k) ≥ 0 for all i ≤ k ≤ j}. Then the total length of
the subpaths in P' is at most (π/2) · dist(v_src, v_dst).

Lemma 5 Select any i such that y(v_i) ≥ 0 and y(v_{i+1}) < 0. Let j > i be the
least value such that y(v_j) ≥ 0 and let c_dfs = (1 + √5)π/2.† Then the length
of the path v_i, ..., v_j is at most c_dfs(x(v_j) − x(v_i)) or the length of
the upper boundary of F between v_i and v_j is at most c_dfs(x(v_j) − x(v_i)).

† We call c_dfs the Dobkin, Friedman and Supowit constant [6].

Our c-competitive routing strategy will visit the subsequence
b_0, ..., b_{l-1} of vertices v_0, ..., v_{m-1} that are above or on the
segment (v_src, v_dst). (Refer to Fig. 4.) If b_i and b_{i+1} are consecutive
on the Voronoi path (i.e., b_i = v_j and b_{i+1} = v_{j+1} for some j) then
our strategy will use the Voronoi path (i.e., the direct edge) from b_i to
b_{i+1}. On the other hand, if b_i and b_{i+1} are not consecutive on the
Voronoi path then by Lemma 5, there exists a path from b_i to b_{i+1} of
length at most c_dfs(x(b_{i+1}) − x(b_i)).

The difficulty occurs because the strategy does not know beforehand which
path to take. The solution is to simulate exploring both paths “in parallel”,
stopping when the first one reaches b_{i+1}.‡

More formally, let P_V = (p_0 = b_i, ..., p_n = b_{i+1}) be the Voronoi path
from b_i to b_{i+1} and let P_F = (q_0 = b_i, ..., q_n = b_{i+1}) be the path
from b_i to b_{i+1} on the upper boundary of F. The strategy is described by
the following algorithm.

1: j ← 0, l_0 ← min{dist(p_0, p_1), dist(q_0, q_1)}.
2: repeat
3:    Explore P_F until reaching b_{i+1} or until reaching a vertex q_x such
      that |q_0, ..., q_{x+1}| > 2 l_j. If b_{i+1} is reached quit, otherwise
      return to b_i.
4:    j ← j + 1, l_j ← |q_0, ..., q_{x+1}|.
5:    Explore P_V until reaching b_{i+1} or until reaching a vertex p_y such
      that |p_0, ..., p_{y+1}| > 2 l_j. If b_{i+1} is reached then quit,
      otherwise return to b_i.
6:    j ← j + 1, l_j ← |p_0, ..., p_{y+1}|.
7: until b_{i+1} is reached
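
The following sketch simulates the parallel search on two explicitly given
paths and reports the total distance travelled. It is an illustration only:
the paths are assumed to be lists of (x, y) points sharing their first and
last vertex, each with at least two vertices, and the names prefix_lengths and
parallel_search are hypothetical.

```python
import math


def prefix_lengths(path):
    """Cumulative Euclidean length of path[0..k] for every index k."""
    out = [0.0]
    for a, b in zip(path, path[1:]):
        out.append(out[-1] + math.dist(a, b))
    return out


def parallel_search(p_v, p_f):
    """Distance travelled from b_i to b_{i+1} when alternately exploring
    P_F and P_V with a doubling bound, as in steps 1-7 above."""
    cum = {"F": prefix_lengths(p_f), "V": prefix_lengths(p_v)}
    bound = min(cum["F"][1], cum["V"][1])  # l_0
    travelled = 0.0
    side = "F"  # step 3 explores P_F first
    while True:
        c = cum[side]
        # Advance while the next vertex is still within twice the bound.
        x = 0
        while x < len(c) - 1 and c[x + 1] <= 2 * bound:
            x += 1
        if x == len(c) - 1:  # b_{i+1} reached
            return travelled + c[x]
        travelled += 2 * c[x]  # walk out to vertex x and back to b_i
        bound = c[x + 1]       # l_{j+1}
        side = "V" if side == "F" else "F"
```

Because the bound more than doubles after every unsuccessful round, the total
distance stays within a constant factor of the shorter of the two paths.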

Lemma 6 Using the parallel search strategy described above, a packet reaches
b_{i+1} after traveling a distance of at most 9 c_dfs (x(b_{i+1}) − x(b_i))
≈ 45.75 (x(b_{i+1}) − x(b_i)).

Given the positions of v_dst and v_src the parallel search strategy described
above is easily implemented as part of an O(1) memory routing strategy. We
refer to the combination of the Voronoi routing strategy with the parallel
search strategy described above as the parallel Voronoi routing (PVR)
strategy.

Theorem 6 The parallel Voronoi routing strategy is (9 c_dfs + π/2)-competitive
for all Delaunay triangulations.

Proof. The strategy incurs two costs: (1) the cost of traveling on subpaths of
the Voronoi path that remain above the x-axis, and (2) the cost of
applications of the parallel search strategy. By Lemma 4, the first cost is at
most (π/2) · dist(v_src, v_dst). By Lemma 6 and the fact that
b_0, ..., b_{l-1} is x-monotone (Lemma 3), the cost of the second is at most
9 c_dfs · dist(v_src, v_dst). Since the Euclidean distance between two
vertices of T is certainly a lower bound on their shortest path distance in T,
the theorem follows. □
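
For concreteness, the constants appearing in Lemma 6 and Theorem 6 can be
evaluated numerically. The variable names below are chosen here for
illustration.

```python
import math

# Dobkin, Friedman and Supowit constant: (1 + sqrt(5)) * pi / 2
c_dfs = (1 + math.sqrt(5)) * math.pi / 2

# Bound of Lemma 6 per unit of x-distance between b_i and b_{i+1}
search_bound = 9 * c_dfs

# Competitive ratio of parallel Voronoi routing (Theorem 6)
pvr_ratio = 9 * c_dfs + math.pi / 2

print(round(c_dfs, 4), round(search_bound, 2), round(pvr_ratio, 2))
```

The three values are approximately 5.0832, 45.75 and 47.32.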

‡ A similar strategy for finding an unknown target point on a line is given by
Baeza-Yates et al. [1]. See also Klein [8].
3.3 Empirical Results



While it is sometimes possible to come up with pathological examples of tri- 
angulations for which a strategy is not competitive, it is often more reasonable 
to use the competitive ratio of a strategy on average or random inputs as an 
indicator of how it will perform in practice. In this section we describe some ex- 
perimental results about the competitiveness of our strategies. All experiments 
were performed on sets of points randomly distributed in the unit square, and 
each data point is the maximum of 50 independent trials. 

The first set of experiments, shown in Fig. 5 (a), involved measuring the
performance of all six routing strategies on Delaunay triangulations. Compass
routing, greedy routing, and Voronoi routing consistently achieved better
competitive ratios, with greedy routing slightly worse than the other two.
Randomized compass routing, right-hand routing, and parallel Voronoi routing
had significantly higher competitive ratios. The results for randomized
compass routing and right-hand routing show a significant amount of jitter.
This is due to the fact that relatively simple configurations§ that can easily
occur in random point sets result in high competitive ratios for these
strategies. On the other hand, parallel Voronoi routing seems much more
stable, and achieves better competitive ratios in practice than its worst case
analysis would indicate.

Fig. 5. Empirical competitive ratios for (a) Delaunay triangulations and (b)
Graham triangulations.

The most important conclusion drawn from these experiments is that there 
are no simple configurations (i.e., that occur often in random point sets) that 
result in extremely high competitive ratios for greedy, compass, Voronoi, or 
parallel Voronoi routing in Delaunay triangulations. This suggests that any of 
these strategies would work well in practice. 

The four simple routing strategies of Section 2 were also tested on Graham
triangulations. These are obtained by first sorting the points by x-coordinate
and then triangulating the resulting monotone chain using a linear time
algorithm for computing the convex hull of a monotone polygonal chain [12].
The results are shown in Fig. 5 (b). In these tests it was always the case
that at least one of the 50 independent triangulations defeated greedy
routing. Thus, there are no results shown for greedy routing. The relative
performance of the compass, randomized compass, and right-hand routing
strategies was the same as for Delaunay triangulations. However, unlike the
results for Delaunay triangulations, the competitive ratio appears to be
increasing linearly with the number of vertices.

§ These configurations come up in the proof of Theorem 5 [3].
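
One common way to realize such a triangulation is a left-to-right sweep that
maintains the upper and lower convex hull chains and emits a triangle whenever
a hull vertex is popped. The sketch below follows that idea; it is an
assumption that this matches the construction used in the experiments, the
points are assumed to have distinct x-coordinates with no three collinear, and
the function names are hypothetical.

```python
def cross(o, a, b):
    """Cross product (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_triangulation(points):
    """Triangulate points by an x-sorted sweep (Graham-scan style)."""
    pts = sorted(points)
    upper, lower, triangles = [pts[0]], [pts[0]], []
    for p in pts[1:]:
        # Upper chain: pop vertices that would create a left turn.
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) > 0:
            triangles.append((upper[-2], upper[-1], p))
            upper.pop()
        upper.append(p)
        # Lower chain: pop vertices that would create a right turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) < 0:
            triangles.append((lower[-2], lower[-1], p))
            lower.pop()
        lower.append(p)
    return triangles
```

Each triangle is emitted exactly once, so the total work after sorting is
linear in the number of points.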

4 Comparison with Related Work 

In this section we survey related work in the area of geometric online
routing, and compare our results with this work. Due to space constraints, we
cannot provide a comprehensive list of references in this version of the
paper. A more complete survey is available in the full version of the paper
[3].

Kranakis et al. [9] study compass routing, and provide a proof that no
Delaunay triangulation defeats compass routing. The current paper makes use of
a very different proof technique to show that compass routing works for a
larger class of triangulations. They also describe an O(1) memory routing
strategy that is not defeated by any connected planar graph, thus proving a
stronger result than Theorem 4.

Lin and Stojmenovic [10] and Bose et al. [4] consider online routing in the
context of ad hoc wireless networks modeled by unit disk graphs. They provide
simulation results for a variety of strategies that measure success rates (how
often a packet reaches its destination) as well as hop-counts of these
strategies on unit disk graphs of random point sets.

To the best of our knowledge, no literature currently exists on the competi- 
tiveness of geometric routing strategies in our abstract setting, and our parallel 
Voronoi routing strategy is the first theoretical result in this area. 

5 Conclusions 

We have studied the problem of online routing in geometric graphs. Our theo- 
retical results show which types of graphs our strategies are guaranteed to work 
on, while our simulation results rank the performance of the strategies on two 
types of random triangulations. These results are summarized in the following 
table. 



Strategy  Memory  Randomized  Class of graphs  Rank 1  Rank 2  Competitive
GR        None    No          Delaunay Δ's     3       -       No
CR        None    No          Regular Δ's      1       1       No
RCR       None    Yes         All Δ's          5       2       No
RHR       O(1)    No          Convex subd.     6       3       No
VR        O(1)    No          Delaunay Δ's     1       -       No
PVR       O(1)    No          Delaunay Δ's     4       -       Yes

Abstract. We study program result checking using AC^0 circuits as checkers.
We focus on the number of queries made by the checker to the program being
checked, and we call this the query complexity of the checker for the given
problem. We study the query complexity of deterministic and randomized AC^0
checkers for certain P-complete and NC^1-complete problems. We show that for
each ε > 0, Ω(n^{1−ε}) is a lower bound on the query complexity of
deterministic AC^0 checkers for the considered problems, for inputs of length
n. On the other hand, we show that suitably encoded complete problems for P
and NC^1 have randomized AC^0 checkers of constant query complexity. The
latter results are proved using techniques from the PCP(n^3, 1) protocol for
3-SAT in [4].



1 Introduction 

In this paper we study program result checking (in the sense of Blum and
Kannan [5]) with AC^0 circuits as checkers and we focus on the number of
queries made by the checker to the checked program. We term this parameter the
query complexity of the checker for the given problem. The query complexity is
an important parameter in the design of efficient program checkers because a
large query complexity can be a serious bottleneck for a checker that may
otherwise be efficient.

The seminal paper of Blum and Kannan [5] already initiates the study of
parallel checkers. They give a deterministic CRCW PRAM constant-time (i.e.
AC^0) program checker for the P-complete problem LFMIS (lexicographically
first maximal independent set problem for graphs). Rubinfeld in [10] makes a
comprehensive algorithmic study of parallel program checkers: parallel
checkers are designed in [10] for various problems with emphasis on analyzing
parallel time and processor efficiency. In particular, an AC^0 checker for the
P-complete problem of evaluating straight-line programs is described in [10].
However, in the context of the present paper, we note that the above-mentioned
AC^0 checkers for P-complete problems described in [5,10] have large query
complexity (the number of queries is proportional to the input size). Indeed
the AC^0 checkers in [5,10] are deterministic, and as shown in the present
paper deterministic AC^0 checkers for standard P-complete problems must
necessarily have large query complexity.

* Part of the work done while at IMSc.

Regarding query complexity, we note that constant query checkers are
mentioned in [1] as a notion of practical significance. For instance, it is
shown in [1] that GCD has a constant query checker. To the best of our
knowledge, there is no other study or explicit mention of query complexity of
program checkers.

Our motivation for studying AC^0 checkers is two-fold. First, in a
complexity-theoretic sense AC^0 represents the easiest model of parallel
computation, and one aspect of program checking is to try and make the checker
as efficient as possible. In this sense AC^0 checkers can be seen as
constant-time parallel checkers since they essentially correspond to
constant-time CRCW PRAM algorithms. The second motivation rests on the main
goal of this paper, namely, to study the query complexity of program checkers.
It turns out that AC^0 yields a model of program checking that is amenable to
lower bound techniques. The problems we consider are different suitably
encoded versions of the Circuit Value Problem (henceforth CVP). Different
versions of the CVP are known to be complete for different important
complexity classes: the unrestricted version is P-complete and the CVP problem
for circuits that are formulas (i.e. fanout of each gate is one) is
NC^1-complete. In fact, using similar techniques we can also suitably encode
an NL-complete problem that has a constant query randomized AC^0 checker.

We deduce nontrivial lower bounds on the query complexity of deterministic
AC^0 checkers for these circuit value problems: we prove the lower bounds by
first noting for the Parity problem that, by a result of Ajtai [2],
Ω(n^{1−ε}) is a lower bound for the query complexity of deterministic AC^0
checkers for Parity. Since Parity is AC^0 many-one reducible to each
considered circuit-value problem the same lower bounds on query complexity
carry over. Thus, we get that for each ε > 0, Ω(n^{1−ε}) is a lower bound on
the query complexity of deterministic AC^0 checkers for these circuit-value
problems.

In contrast, we design randomized AC^0 checkers of constant query complexity
for these circuit-value problems. We outline the ideas involved: consider a
deterministic AC^0 checker for the given circuit-value problem (e.g. the one
described in [10] for general straight-line programs). The query complexity of
this checker is roughly the number of gates in the input circuit. Our
randomized constant query AC^0 checkers for the circuit-value problems use
ideas from the PCP(n^3, 1) protocol for satisfiability [4]. Intuitively, it
turns out that the number of probes into an NP proof by a PCP protocol
corresponds to the number of queries that the checker needs to make for a
given instance of the circuit-value problem. A difficulty in the checker
setting (which doesn't arise in the PCP(n^3, 1) protocol) is that the AC^0
checker needs to compute unbounded GF(2) sums. It cannot directly do this
because Parity is not computable in AC^0. But we can get around this
difficulty by using Rubinfeld's parallel checker for Parity [10], which we
note is already an AC^0 constant-query checker. Since Parity is AC^0 many-one
reducible to the considered circuit-value problems, we can use the tested
program for CVP to compute Parity and use Rubinfeld's checker as subroutine to
check that the returned answer is correct. Thus we are able to design a
constant-query AC^0 checker for CVP.

The Query Complexity of Program Checking by Constant-Depth Circuits

As explained in [4], the PCP theorem has evolved from interactive proofs [3]
and program checking [5,6]. In particular, there is a strong influence of
ideas from self-correcting programs in the PCP(n^3, 1) protocol for 3-SAT [4].
It is not surprising, therefore, that ingredients of the PCP(n^3, 1) protocol
find application in program checking. Our emphasis on the query complexity of
program checkers leads naturally to ideas underlying probabilistically
checkable proofs. Applying more ideas from the PCP theorem to other specific
problems in program checking appears to be an attractive area worth exploring.

2 Preliminaries 

We first formally define program checkers introduced in [5].

Definition 1. Let A be a decision problem. A program checker for A, C_A, is a
(probabilistic) oracle algorithm that, for any program P (supposedly for A)
that halts on all instances, for any instance x of A, and for any positive
integer k (the security parameter) presented in unary, satisfies:

1. If P is a correct program, that is, if P(x) = A(x) for all instances x,
then with probability 1, C_A(x, P, k) = Correct.

2. If P(x) ≠ A(x) then with probability ≥ 1 − 2^{−k}, C_A(x, P, k) =
Incorrect.

The probability is computed over the sequences of coin flips that C_A could
have tossed. Importantly, C_A is allowed to make queries to the program P on
some instances.

When we speak of AC^0 checkers we mean that the checker C_A is described by a
(uniform) family of AC^0 circuits, one for each input size. We will also
consider the (stronger) notion of deterministic checkability. The decision
problem A is said to be deterministically checkable if C_A in the above
definition is a deterministic algorithm. Next we define the query complexity
of AC^0 checkers.

Definition 2. Let L be a decision problem that is deterministically AC^0
checkable. The AC^0 checker defined by the circuit family {C_n}_{n>0} is said
to have query complexity q(n) if q(n) bounds the number of queries made by the
checker circuit C_n for any input x ∈ Σ^n.

We now describe the CVP problems and the encodings of their instances. Let C
denote a boolean circuit over the standard base (of NOT, AND, and OR gates).
We consider circuits of fanin bounded by two. We will encode the circuit C as
4-tuples (g_1, g_2, g_3, t), where t is a constant number of bits to indicate
the type of the gate labeled g_1, and g_2 and g_3 are the gates whose values
feed into the gate labeled g_1. For uniformity, we can assume that NOT gates
are also encoded as such 4-tuples, except that g_2 = g_3. Furthermore, we
insist that in the encoding, the gate labels g_i be topologically sorted
consistent with the DAG underlying the circuit C. Thus, in each 4-tuple
(g_1, g_2, g_3, t) present in the encoding of C, it will hold that g_1 > g_2
and g_1 > g_3. This stipulation ensures that checking whether an encoding
indeed represents a circuit can be done in AC^0. For a circuit C with n inputs
let C_g(x_1, x_2, ..., x_n) denote the value of the circuit C at gate g
designated as output gate. We now define the circuit value problem, which is
the decision problem that we shall be mainly concerned with in this paper:
{(C, g, x_1, x_2, ..., x_n) | C_g(x_1, x_2, ..., x_n) = 1}. The circuit C is
encoded as described above.
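
The encoding can be made concrete with a small sketch. Several details below
are assumptions introduced for illustration and are not fixed by the text:
labels 1, ..., n denote the inputs x_1, ..., x_n, gate labels start at n + 1,
and the gate type t is written as one of the strings "AND", "OR", "NOT". The
function names are hypothetical.

```python
def is_valid_encoding(tuples, n):
    """Check gate types and the topological condition g1 > g2, g1 > g3."""
    return all(
        t in ("AND", "OR", "NOT")
        and g1 > n
        and 1 <= g2 < g1
        and 1 <= g3 < g1
        and (t != "NOT" or g2 == g3)
        for g1, g2, g3, t in tuples
    )


def evaluate(tuples, g, x):
    """Value C_g(x_1, ..., x_n) of the circuit at gate g."""
    val = {i + 1: bool(b) for i, b in enumerate(x)}
    # Labels are topologically sorted, so increasing label order works.
    for g1, g2, g3, t in sorted(tuples):
        if t == "AND":
            val[g1] = val[g2] and val[g3]
        elif t == "OR":
            val[g1] = val[g2] or val[g3]
        else:  # NOT, encoded with g2 == g3
            val[g1] = not val[g2]
    return val[g]
```

An instance (C, g, x_1, ..., x_n) belongs to CVP exactly when evaluate returns
True.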

The above circuit value problem is known to be P-complete (under projection
reducibility). We denote this by CVP. If the input circuit is a formula (i.e.
each gate of the input circuit has fanout at most 1) the corresponding circuit
value problem is known to be NC^1-complete under projection reducibility; we
denote this problem by FVP (for formula value problem). Notice that an AC^0
circuit can check if a given circuit is a formula or not. Finally, we make
another important stipulation on the circuits that are valid inputs for all
the circuit value problems that we consider: we insist that the fanout of each
gate is bounded by two. Notice that this last restriction on the input
circuits does not affect the fact that CVP remains P-complete (in fact, such a
restriction already holds for the CVP in the standard P-completeness proof by
simulating polynomial-time Turing machines). Also, observe that this extra
stipulation on the input circuits can be easily tested in AC^0.

3 Deterministic AC^0 checkers

We first recall deterministic AC^0 checkers for Parity and the circuit value
problem [5,10].

Proposition 1. For each constant k, there is a deterministic AC^0 checker for
Parity(x_1, x_2, ..., x_n) of query complexity n/log^k n.

Proof. Let P be an alleged program for the Parity function. First observe that
the AC^0 checker can make parallel queries to P for Parity(x_1, x_2, ..., x_i)
for 2 ≤ i ≤ n. In order to verify that the program's value of
Parity(x_1, x_2, ..., x_n) is correct the checker just has to verify that the
answers to the queries for Parity(x_1, x_2, ..., x_i) are all locally
consistent: P(x_1, x_2, ..., x_{i+1}) = x_{i+1} ⊕ P(x_1, x_2, ..., x_i) for
2 ≤ i ≤ n − 1 (together with the base case P(x_1, x_2) = x_1 ⊕ x_2, which the
checker can evaluate directly). The above verification can be easily done in
parallel in AC^0 since the query answers P(x_1, x_2, ..., x_i) for 2 ≤ i ≤ n
are available. This yields an AC^0 checker which makes n − 1 queries. In order
to design a checker with the number of queries scaled down to n/log^k n notice
that we can compute the parity of log n boolean variables in AC^0 by brute
force. Thus, we can group the n input variables into n/log n groups of log n
variables each, compute the parity of each group by brute force, and the
problem boils down to checking the program's correctness for the parity of
n/log n variables, which we can do as before with n/log n queries to the
program. Clearly, we can repeat the above strategy of grouping variables for a
constant number of rounds, and therefore achieve the query complexity of
n/log^k n with an appropriate constant increase of depth of the checker
circuit. This completes the proof.
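
The first part of the proof (the checker with n − 1 queries) can be sketched
sequentially as follows. This is only an illustration of the local consistency
test, not an AC^0 circuit, and it omits the grouping step; the program is
modelled as a Python callable on a list of bits, and the name check_parity is
hypothetical.

```python
def check_parity(program, x):
    """Deterministic Parity checker using n - 1 prefix queries (n >= 2)."""
    n = len(x)
    # Query the program on every prefix x_1..x_i for 2 <= i <= n.
    answers = {i: program(x[:i]) for i in range(2, n + 1)}
    # Base case: the parity of two bits is computed directly.
    ok = answers[2] == x[0] ^ x[1]
    # Local consistency: P(x_1..x_{i+1}) = x_{i+1} XOR P(x_1..x_i).
    ok = ok and all(answers[i + 1] == x[i] ^ answers[i]
                    for i in range(2, n))
    return "Correct" if ok else "Incorrect"
```

If every local test passes, then by induction on i all prefix answers,
including the answer for the full input, are correct.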

Remark: Notice that the above result applies to checking iterated products over 
arbitrary finite monoids. The proof and construction of the checker is similar. 

Indeed, the above idea of doing piece-wise “local testing” in order to check
global correctness is already used in the deterministic CRCW PRAM checker in
[5] for an encoding of the P-complete problem LFMIS, and also in [10] for
checking straight-line programs. Notice that the checker for general
straight-line programs includes checking the circuit-value problem as a
special case. We include a proof sketch below since it is the starting point
for the main results of this paper.

Theorem 1. [10] CVP has a deterministic AC^0 checker.

Proof. Let P be an alleged program for CVP and let (C, g, x_1, ..., x_n) be an
input instance for program P. The AC^0 checker first queries in parallel the
program for P(C, g, x_1, ..., x_n) for all gates g ∈ C. Next, for each tuple
(g_1, g_2, g_3, t) in the circuit description C the checker verifies that the
program's answers are consistent with the gate type. This is again done in
parallel for each tuple. The checker must also validate the input by verifying
that the tuples that describe the circuit indeed describe an acyclic digraph.
This is made sure as described in our encoding of the instances of the CVP. It
suffices to check that g_1 > g_2 and g_1 > g_3 for each tuple
(g_1, g_2, g_3, t), which can be done in AC^0. This completes the proof.
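
A sequential sketch of this checker is given below. It is not an AC^0 circuit
and makes several assumptions for illustration: inputs carry labels 1, ..., n,
gate types are the strings "AND", "OR", "NOT", bits are the integers 0 and 1,
and the program is a Python callable. The names check_cvp and honest_program
are hypothetical.

```python
GATE_OPS = {
    "AND": lambda b, c: b & c,
    "OR": lambda b, c: b | c,
    "NOT": lambda b, c: 1 - b,  # NOT gates are encoded with g2 == g3
}


def check_cvp(program, tuples, g, x):
    """Query the program on every gate and test local consistency."""
    n = len(x)
    # Input validation: known gate types and topologically sorted labels.
    for g1, g2, g3, t in tuples:
        if t not in GATE_OPS or g1 <= n or not (1 <= g2 < g1 and 1 <= g3 < g1):
            raise ValueError("not a valid circuit encoding")
    answers = {g1: program(tuples, g1, x) for g1, _, _, _ in tuples}
    if g not in answers:
        raise ValueError("output gate is not a gate of the circuit")

    def value(label):
        return x[label - 1] if label <= n else answers[label]

    consistent = all(answers[g1] == GATE_OPS[t](value(g2), value(g3))
                     for g1, g2, g3, t in tuples)
    return "Correct" if consistent else "Incorrect"


def honest_program(tuples, g, x):
    """A correct CVP program, used here only to exercise the checker."""
    val = {i + 1: b for i, b in enumerate(x)}
    for g1, g2, g3, t in sorted(tuples):
        val[g1] = GATE_OPS[t](val[g2], val[g3])
    return val[g]
```

The number of queries equals the number of gates, which is the large query
complexity that the randomized checker is designed to avoid.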

It can be shown similarly that FVP has a deterministic AC^0 checker. We now
turn to lower bounds on the query complexity of AC^0 checkers for CVP and FVP.
We first observe the following property of languages L having deterministic
AC^0 checkers.

Lemma 1. Let L be a decision problem that is deterministically AC^0 checkable
and has an AC^0 checker of query complexity q(n). Then, for each n > 0, there
is a nondeterministic AC^0 circuit that takes n input bits and q(n)
nondeterministic bits and accepts an input x ∈ Σ^n iff x ∈ L.

Observe that, by symmetry, such nondeterministic AC^0 circuits also exist for
the complement of L. The proof of the above lemma is a direct consequence of
the definition of deterministic checkers and is a variation of a result on
self-helping due to Schöning: as observed e.g. in [5], polynomial-time
deterministic checking coincides with self-helping as defined by Schöning, who
showed that languages that have self-helpers are already in NP ∩ co-NP. The
above lemma is an extension of this fact to the setting where the checker is
in AC^0. The only extra observation made in Lemma 1 is that the number of
queries made by the checker naturally translates into the number of
nondeterministic bits used by the nondeterministic circuit. We next recall a
result due to Ajtai [2] on lower bounds for AC^0 circuits approximating
Parity.

Theorem 2. For all constants k, c, and ε > 0, there is no depth k circuit of
size n^c that can compute Parity(x_1, x_2, ..., x_n) for more than a
1/2 + 2^{−n^{1−ε}} fraction of the inputs. Thus, no nondeterministic AC^0
circuit with n^{1−ε} nondeterministic bits can compute
Parity(x_1, x_2, ..., x_n).



Lemma 2. Parity does not have deterministic AC^0 checkers that make
O(n^{1−ε}) queries for any ε > 0.

Proof. Assume that Parity has AC^0 checkers that make O(n^{1−ε}) queries for
some ε > 0. From Lemma 1 it follows that for each n > 0, there is a
nondeterministic AC^0 circuit that takes n input bits and O(n^{1−ε})
nondeterministic bits and accepts x ∈ Σ^n iff x has odd parity. This is
impossible since it contradicts Theorem 2 of Ajtai.



Theorem 3. CVP (likewise FVP) does not have deterministic AC^0 checkers that
make O(n^{1−ε}) queries for any ε > 0.

Proof. We prove it just for CVP. Notice that we can design an AC^0 circuit
(call it C') such that given an instance x ∈ Σ^n of Parity the AC^0 circuit
produces an instance (C, x_1, x_2, ..., x_n, g) of CVP such that
Parity(x_1, x_2, ..., x_n) = 1 iff it holds that
(C, x_1, x_2, ..., x_n, g) ∈ CVP. Moreover, the size of
(C, x_1, x_2, ..., x_n, g) is O(n log n), since C encodes the linear-sized
circuit for Parity in the 4-tuple encoding we are using for CVP instances.
Assume that CVP has an AC^0 checker that makes O(n^{1−ε}) queries for some
ε > 0. Combining the nondeterministic circuit given by Lemma 1 with the AC^0
circuit C', we get a nondeterministic AC^0 circuit that takes n input bits and
O(n^{1−δ}) nondeterministic bits and accepts an input x ∈ Σ^n iff x has odd
parity, for some suitable δ > 0. This contradicts Lemma 2 and hence completes
the proof.

4 A constant query randomized AC^0 checker for CVP

We first recall the relevant definition and results from [4] concerning the
linearity test.

Definition 3. Let F be GF(2) and f, g be functions from F^n to F. The relative
distance Δ(f, g) between f and g is the fraction of points in F^n on which
they disagree. If Δ(f, g) ≤ δ then f is said to be δ-close to g.

Theorem 4. Let F be GF(2) and f be a function from F^n to F such that when we
pick y, z randomly from F^n, Prob[f(y) + f(z) = f(y + z)] ≥ 1 − δ, where
δ < 1/6. Then f is 3δ-close to some linear function.

The theorem gives a linearity test that needs to evaluate f at only a constant
number of points in F^n, where the constant depends on δ. If f passes the test
then the function is guaranteed to be 3δ-close to some linear function. Given
a function f guaranteed to be δ-close to a linear function f̂, and x in the
domain of f, let us denote by SC-f(x) the value f(x + r) − f(r), for a
randomly chosen r in the domain of f. Then the above theorem guarantees that
with high probability SC-f(x) is equal to f̂(x).
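
The linearity test and the self-corrector SC-f can be sketched as follows.
This is an illustration under assumptions: vectors over GF(2) are tuples of 0s
and 1s, f is a Python callable returning 0 or 1, the number of trials is a
free parameter chosen by the caller, and the function names are hypothetical.
Over GF(2) addition and subtraction are both XOR.

```python
import random


def add(u, v):
    """Componentwise sum of two vectors over GF(2)."""
    return tuple(a ^ b for a, b in zip(u, v))


def rand_vec(n, rng):
    """Uniformly random vector in GF(2)^n."""
    return tuple(rng.randrange(2) for _ in range(n))


def linearity_test(f, n, trials, rng):
    """Accept iff f(y) + f(z) = f(y + z) on all sampled pairs (y, z)."""
    for _ in range(trials):
        y, z = rand_vec(n, rng), rand_vec(n, rng)
        if f(y) ^ f(z) != f(add(y, z)):
            return False
    return True


def self_correct(f, x, rng):
    """SC-f(x) = f(x + r) - f(r) for a random r."""
    r = rand_vec(len(x), rng)
    return f(add(x, r)) ^ f(r)
```

For an exactly linear f the test always accepts and SC-f(x) equals f(x) for
every choice of r.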

Next we recall another lemma from [4] (as stated in [3, Lemma 7.13]). Given a
in GF(2)^n, the outer product b = a ⊗ a is an n × n matrix over GF(2) such
that b_{ij} = a_i a_j.

Lemma 3. [3] Let a ∈ GF(2)^n and b be an n × n matrix over GF(2). Suppose
b ≠ a ⊗ a; then Prob[r^t (a ⊗ a) s ≠ r^t b s] ≥ 1/4, where r and s are
randomly chosen from GF(2)^n.
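
For small n the probability in Lemma 3 can be computed exactly by enumerating
all pairs (r, s). The sketch below does this; the representation (tuples and
nested lists of 0s and 1s) and the function names are assumptions made for
illustration.

```python
from itertools import product


def outer(a):
    """Outer product a (x) a over GF(2): entry (i, j) is a_i * a_j."""
    return [[ai & aj for aj in a] for ai in a]


def bilinear(r, m, s):
    """r^t m s over GF(2)."""
    n = len(r)
    return sum(r[i] & m[i][j] & s[j] for i in range(n) for j in range(n)) % 2


def disagreement_fraction(a, b):
    """Fraction of pairs (r, s) with r^t (a (x) a) s != r^t b s."""
    m = outer(a)
    vecs = list(product((0, 1), repeat=len(a)))
    bad = sum(bilinear(r, m, s) != bilinear(r, b, s)
              for r in vecs for s in vecs)
    return bad / len(vecs) ** 2
```

When b differs from a ⊗ a in a single entry (i, j), the two forms disagree
exactly when r_i = s_j = 1, which gives the extreme value 1/4.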

A randomized AC^0 checker for Parity is described in [10] based on the fact
that Parity is random self-reducible. Additionally, we observe that this
checker has constant query complexity and we will use it as subroutine in the
checker for CVP.

Theorem 5. [10] Parity has a randomized AC^0 checker of constant query
complexity.

We are now ready to design the constant query checker for the CVP problem
(also for FVP). We make use of ideas in the PCP(n^3, 1) protocol for 3-SAT
from [3]. A crucial point of departure from [3] is when the checker needs to
compute the parity of a multiset of input variables and a product of input
variables. To do this we use as subroutine the checker of Theorem 5. Another
point to note is that all queries have to be valid instances of CVP (or FVP as
the case may be), and they need to be generated in AC^0. The starting point is
the deterministic AC^0 checker for CVP described in Theorem 1. Recall that
given instance (C, g, x_1, ..., x_n) the deterministic AC^0 checker queries
the program for (C, g_i, x_1, ..., x_n), for each gate g_i of C. Then it
checks that the query answers are locally consistent for each gate. Let
y_1, y_2, ..., y_m be the query answers by P for the queries
(C, g_i, x_1, ..., x_n), 1 ≤ i ≤ m, where C has m gates. The unique correct
vector y_1, y_2, ..., y_m is a satisfying assignment to the collection of all
the gate conditions (each of which is essentially a 3-literal formula). The
idea is to avoid querying explicitly for the y_i's. Instead, using randomness
the checker will make fewer queries for other inputs that encode the y_i's.
More precisely, we need to encode y_1, y_2, ..., y_m in a way that making a
constant number of queries to the program (which is similar to a constant
number of probes into a proof by a PCP protocol) can convince the AC^0 checker
with high probability that y_1, y_2, ..., y_m is consistent with all the gate
conditions.

Theorem 6. The P-complete problem CVP (likewise the $NC^1$-complete problem
FVP) has a randomized $AC^0$ checker of constant query complexity.

Proof. We describe the checker only for CVP (the checker for FVP is similar).
Let $P$ be a program for CVP and $(C, g, x_1, \ldots, x_n)$ be an input. Suppose
$C$ has $m$ gates $g_1, \ldots, g_m$; w.l.o.g. assume $g_m = g$. The
deterministic checker described earlier queries $P$ for
$(C, g_i, x_1, \ldots, x_n)$, for each $g_i$. It then performs a local
consistency check to test the validity of $(C, g_m, x_1, \ldots, x_n)$. The new
checker must avoid querying explicitly for each
$y_i := P(C, g_i, x_1, \ldots, x_n)$. Let $t_i$ denote a GF(2) polynomial
corresponding to the $i$th gate of the circuit $C$, where $t_i$ is a polynomial
in at most three variables (these three variables are from
$\{x_1, \ldots, x_n\} \cup \{y_1, y_2, \ldots, y_m\}$). Define $t_i$ such that it
is zero iff the corresponding variables are consistent with the gate type of
$g_i$. More precisely, let $g$ be an AND gate with output $a$ and inputs $b$ and
$c$; then the polynomial corresponding to $g$ is $a + bc$. Similarly, for an OR
gate with output $a$ and inputs $b$ and $c$ the polynomial is $a + b + c + bc$,
and for a NOT gate with output $a$ and input $b$ it is $a + b + 1$.
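
The three gate polynomials can be checked exhaustively. The short sketch below
(function names are illustrative) evaluates each polynomial over GF(2) so that
it can be compared with the gate's truth table.

```python
from itertools import product


def and_poly(a, b, c):
    """a + bc over GF(2): zero iff a = b AND c."""
    return (a + b * c) % 2


def or_poly(a, b, c):
    """a + b + c + bc over GF(2): zero iff a = b OR c."""
    return (a + b + c + b * c) % 2


def not_poly(a, b):
    """a + b + 1 over GF(2): zero iff a = NOT b."""
    return (a + b + 1) % 2


for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, and_poly(a, b, c), or_poly(a, b, c), not_poly(a, b))
```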

The checker 

The checker first queries the program on input $(C, g, x_1, \ldots, x_n)$ and
sets $y_m = P(C, g, x_1, \ldots, x_n)$. It suffices for the $AC^0$ checker to
verify that the following linear function
$f(x, y, z) = \sum_{i=1}^{m} t_i(x, y) z_i$ in the new variables $z_i$,
$1 \le i \le m$, is the zero linear function.

Notice that if this function is nonzero then the linear function $f(x, y, z)$
evaluated at a randomly chosen $z := (z_1, \ldots, z_m)$ would be nonzero with
probability 1/2. If it were the zero function then it must be zero with
probability 1.
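
The probability 1/2 can be confirmed by enumeration once x and y are fixed, so
that each $t_i(x, y)$ is a constant 0 or 1. In the sketch below the tuple t
stands for these constants; the names are illustrative.

```python
from itertools import product


def f_value(t, z):
    """Evaluate sum_i t_i * z_i over GF(2)."""
    return sum(ti & zi for ti, zi in zip(t, z)) % 2


def nonzero_fraction(t):
    """Fraction of vectors z on which the linear function is nonzero."""
    m = len(t)
    hits = sum(f_value(t, z) for z in product((0, 1), repeat=m))
    return hits / 2 ** m


print(nonzero_fraction((0, 1, 0, 1)), nonzero_fraction((0, 0, 0, 0)))
```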

Recall that the checker must compute this value by asking a series of CVP
queries that are generated in $AC^0$. Towards this end, we rewrite the above
expression: $f(x, y, z) = p(x, z) + q(y, z) + r(y, z)$, where
$p(x, z) = \sum_{k=1}^{m} \sum_{i=1}^{n} \gamma_i^k x_i + \sum_{k=1}^{m} \sum_{i,j=1}^{n} \gamma_{ij}^k x_i x_j$,
$q(y, z) = \sum_{i=1}^{m} c_i * y_i$, and
$r(y, z) = \sum_{(i,j) \in [m] \times [m]} c_{ij} * y_i y_j$.

In the above expression $\gamma_i^k$ is 1 iff $x_i$ appears in the polynomial
$t_k$ and $z_k$ is 1. Likewise $\gamma_{ij}^k$ is 1 iff $x_i x_j$ appears in the
polynomial $t_k$ and $z_k$ is 1. From this it follows that an $nm + n^2 m$
length Boolean vector representing each term in $p$ can be obtained in $AC^0$.
Notice that the coefficients $c_i$ and $c_{ij}$ in $q$ and $r$ depend upon gates
$g_i$ and $g_j$, the constant number of gates they feed into, the $z$ values
corresponding to these gates and a constant number of input bits. So computing
each of these coefficients involves computing the parity of a constant number of
Boolean variables. This can also be done in $AC^0$.^1

We describe below how the checker computes p, q, and r with high probability. 
To complete the checking the checker evaluates p + q + r and accepts P{x) as 
correct only if this sum is zero. 

Computing p. Note that $p$ is the parity of $nm + n^2 m$ Boolean variables. As
noted above the value of these variables can be obtained in $AC^0$ given $x$ and
$z$. The checker constructs a description of a canonical circuit for the parity
of $nm + n^2 m$ variables, and queries the program on this input. Next the
$AC^0$ checker checks the answer of $P$ using the checker of Theorem 5 as a
subroutine. If the answer is wrong then with high probability the subroutine
checker will reject the program as incorrect. Thus the $AC^0$ checker computes a
value $\tilde{p}$ which is $p$ with high (constant) probability. In the process
only a constant number of queries are made to $P$.

Notice that, unlike in [4], we have to deal with both $y_1, \ldots, y_m$ as well
as $x_1, x_2, \ldots, x_n$ which occur in the polynomials $t_i$. The crucial
difference between the $x_j$'s and $y_j$'s is that $x_1, x_2, \ldots, x_n$ are
bound to the input values. Thus, computing the value of $p$ is a parity
computation which the checker requires to get done. As explained above, this is
done using the checker of Theorem 5 as a subroutine.

^1 It is easier to first conceive of a constant time CRCW PRAM algorithm for this task.

Computing q and r. To compute $q$ and $r$ the checker goes through the following
steps.

1. It builds a circuit $C_1$ with new inputs $r_1, r_2, \ldots, r_m$ that
computes the function $\sum_{i=1}^{m} y_i r_i$. Recall that $y_i$ is the output
of the $i$th gate of the input circuit $C$ on input $x_1, x_2, \ldots, x_n$.
Clearly, the encoding of $C_1$ can be generated by an $AC^0$ circuit from the
encoding of $C$.

2. Similar to the above step, the checker builds a circuit $C_2$ with new inputs
$r_{ij}$, $1 \le i, j \le m$, that computes
$\sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j r_{ij}$. It is clear that an encoding for
$C_2$ can also be generated by an $AC^0$ circuit.

3. The checker verifies that the program's behavior on $C_1$ is a function that
is $\delta$-close to a linear function in the variables $r_i$, and the program's
behavior on $C_2$ is a function that is $\delta$-close to a linear function in
the variables $r_{ij}$. This can be done as described in the previous section
using Theorem 4. If either of the tests fails, it rejects the program as being
incorrect.

4. Like in the PCP protocol the checker now performs a consistency check. Let
$\sum_{i=1}^{m} \sum_{j=1}^{m} b_{ij} r_{ij}$ be the linear function to which
$C_2$ is $\delta$-close. The checker does a constant query test, and ensures
with high probability that the matrix $b_{ij}$ is the tensor product of $y$ with
itself.

To do this the checker employs the test given by Lemma 3. For two randomly
chosen vectors $r_1, r_2$ of length $m$ it verifies that
$SC\text{-}C_1(r_1) * SC\text{-}C_1(r_2) = SC\text{-}C_2(r_1 \otimes r_2)$.

Note that the tensor product can be computed in $AC^0$ and the checker needs
to ask 6 queries of $P$.

Having performed the linearity and consistency tests the checker evaluates
$q$ and $r$ by self-correction. Let $c$ denote the $m$-vector
$c_1, c_2, \ldots, c_m$ and let $d$ denote the $m \times m$-vector consisting of
$c_{ij}$, $1 \le i, j \le m$. The checker sets $\tilde{q}$ to
$SC\text{-}C_1(c)$ and $\tilde{r}$ to $SC\text{-}C_2(d)$.
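
The linearity test, the consistency test and self-correction can be sketched as
follows. This is a minimal illustration under stated assumptions: the program's
behavior on $C_1$ and $C_2$ is modelled by two Python functions f1 and f2 on 0/1
vectors, the tests are repeated a fixed number of rounds instead of being tuned
to a closeness parameter, and all names are hypothetical. Each self-corrected
evaluation uses two queries, so one round of the consistency test uses six.

```python
import random


def rand_vec(n, rng):
    return [rng.randrange(2) for _ in range(n)]


def xor(u, v):
    return [a ^ b for a, b in zip(u, v)]


def linear(coeffs):
    """The GF(2)-linear function with the given coefficient vector."""
    return lambda w: sum(a & b for a, b in zip(coeffs, w)) % 2


def linearity_test(f, n, rng, rounds=20):
    """Accept iff f(u) + f(v) = f(u + v) holds on all sampled pairs u, v."""
    for _ in range(rounds):
        u, v = rand_vec(n, rng), rand_vec(n, rng)
        if (f(u) ^ f(v)) != f(xor(u, v)):
            return False
    return True


def self_correct(f, w, rng):
    """Evaluate the linear function close to f at w, using a random shift."""
    s = rand_vec(len(w), rng)
    return f(xor(w, s)) ^ f(s)


def consistency_test(f1, f2, m, rng, rounds=20):
    """Check SC-f1(r1) * SC-f1(r2) = SC-f2(r1 tensor r2) on random r1, r2."""
    for _ in range(rounds):
        r1, r2 = rand_vec(m, rng), rand_vec(m, rng)
        tensor = [a & b for a in r1 for b in r2]
        lhs = self_correct(f1, r1, rng) & self_correct(f1, r2, rng)
        if lhs != self_correct(f2, tensor, rng):
            return False
    return True


rng = random.Random(1)
y = [1, 0, 1]
f1 = linear(y)
f2 = linear([a & b for a in y for b in y])
print(linearity_test(f1, 3, rng), consistency_test(f1, f2, 3, rng))
```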

Correctness.

If $P$ is correct for all inputs then with probability 1 the checker will pass
$P$ as correct. Suppose that $P$ is incorrect on $(C, g, x_1, \ldots, x_n)$ and
let $P(C, g, x_1, \ldots, x_n) = b$. Let $F$ be the event that the checker fails
to detect the program as incorrect. Let $T$ be the event that the checker passes
the linearity and consistency tests done in the course of computing $q$ and $r$.
Let $w$ denote the concatenation of all random strings used by the checker.
Notice that it suffices to bound $\mathrm{Prob}_w[F \mid T]$ as
$\mathrm{Prob}_w[F] \le \mathrm{Prob}_w[F \mid T]$. Therefore, we can assume
that $C_1$ is $\delta$-close to a unique linear function of the $r_i$'s,
$\sum_{i=1}^{m} y_i r_i$, wherein $y_m = b$. Likewise $C_2$ computes a function
$\delta$-close to the linear function
$\sum_{(i,j) \in [m] \times [m]} y_i y_j * r_{ij}$. Let $y = (y_1, \ldots, y_m)$
be this unique linear function. Given that $P(x)$ is incorrect, the function
$f(x, y, z) = \sum_{i=1}^{m} t_i(x, y) z_i$ is a nonzero linear function of the
$z_i$'s. Hence, $\mathrm{Prob}_w[f(x, y, z) = 1 \mid T] = 1/2$. Now,

$\mathrm{Prob}_w[F \mid T]$
  $= \mathrm{Prob}_w[F \wedge f(x, y, z) = 1 \mid T] + \mathrm{Prob}_w[F \wedge f(x, y, z) = 0 \mid T]$
  $\le \mathrm{Prob}_w[F \wedge f(x, y, z) = 1 \mid T] + 1/2$
  $= \mathrm{Prob}_w[f(x, y, z) = 1 \mid T] * \mathrm{Prob}_w[F \mid f(x, y, z) = 1 \wedge T] + 1/2$
  $= 1/2 * \mathrm{Prob}_w[F \mid f(x, y, z) = 1 \wedge T] + 1/2$

Since $f(x, y, z) = p(x, z) + q(y, z) + r(y, z)$ and $F$ is the event that
$\tilde{p} + \tilde{q} + \tilde{r} = 0$, we observe:
$\mathrm{Prob}_w[F \mid f(x, y, z) = 1 \wedge T] \le \mathrm{Prob}_w[\tilde{p} \ne p(x, z) \mid T] + \mathrm{Prob}_w[\tilde{q} \ne q(y, z) \mid T] + \mathrm{Prob}_w[\tilde{r} \ne r(y, z) \mid T]$.

From Theorem 5 the first term is bounded by 3/4. Each of the other two terms
is bounded by $2\delta$. Putting it together we get,
$\mathrm{Prob}_w[F \mid T] \le 1/2 * (3/4 + 4\delta) + 1/2$.
This is smaller than 15/16 if we choose $\delta$ smaller than 1/32.
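
The arithmetic behind the constant 15/16 can be checked with exact rationals.
The function name below is illustrative; it simply evaluates the right-hand side
of the final bound for a given closeness parameter.

```python
from fractions import Fraction


def failure_bound(delta):
    """Evaluate 1/2 * (3/4 + 4*delta) + 1/2, the bound on Prob[F | T]."""
    return Fraction(1, 2) * (Fraction(3, 4) + 4 * delta) + Fraction(1, 2)


print(failure_bound(Fraction(1, 32)), failure_bound(Fraction(1, 64)))
```

At delta = 1/32 the bound equals 15/16 exactly, so any smaller delta gives a
strictly smaller failure probability.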

Since the error in the checker is one-sided, we can easily make the error 
probability an arbitrarily small constant by repeating the checker a constant 
number of times in parallel and rejecting the program if in one such repetition 
the checker rejects. This completes the proof.

Abstract. Our main result shows that a shortest proof size of tree-like
resolution for the pigeonhole principle is superpolynomially larger than
that of DAG-like resolution. In the proof of a lower bound, we exploit
a relationship between tree-like resolution and backtracking, which has
long been recognized in this field but not been used before to give explicit
results.



1 Introduction 

A proof system is a nondeterministic procedure to prove the unsatisfiability of 
CNF formulas, which proceeds by applying (usually simple) rules each of which 
can be computed in polynomial time. Therefore, if there is a proof system which 
runs in a polynomial number of steps for every formula, then NP = coNP. Since
this is not likely, it has long been an attractive research topic to find exponential
lower bounds for existing proof systems. There are still a number of well-known 
proof systems for which no exponential lower bounds have been found, such as 
Frege systems [3]. 

Resolution is one of the most popular and simplest proof systems. Even so, 
it took more than two decades before Haken [Bj finally obtained an exponential 
lower bound for the pigeonhole principle. This settlement of the major open 
question, however, has stimulated continued research on the topic mm- The 
reason is that Haken’s lower bound is quite far from being tight and his proof, 
although based on an excellent idea later called bottleneck counting, is not so 
easy to read. 

Tree-like resolution is a restricted resolution whose proof must be given as
a tree rather than a directed acyclic graph (DAG). It is a common perception
that tree-like proof systems are exponentially weaker than their DAG
counterparts. Again, however, proving this for resolution was not easy: In [2],
Bonet et al. showed that there exists a formula whose tree-like resolution
requires $2^{\Omega(n^{\epsilon})}$ steps for some $\epsilon$, while $n^{O(1)}$
steps suffice for DAG-like resolution.

In this paper, we give such a separation between tree-like and DAG-like
resolutions using the pigeonhole principle that is apparently the most famous
and well-studied formula. Our new lower bound for tree-like resolution is
$(n/4)^{n/4}$ steps for the $n+1$ by $n$ pigeonhole formula. The best previous
lower bound is $2^n$ [3], which is not enough for such a (superpolynomial)
separation since the best known upper bound of DAG-like resolution is
$O(n^3 2^n)$ [4]. Our new lower bound shows that tree-like resolution is
superpolynomially slower than DAG-like resolution for the pigeonhole principle.

09480055 and 10205215.

A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 133-yA^ 1999.

Another contribution of this paper is that the new bound is obtained by 
fully exploiting the relationship between resolution and backtracking. This rela- 
tionship has long been recognized in the community, but it was informal and no 
explicit research results have been reported. This paper is the first to formally 
claim a benefit of using this relationship. Our lower bound proof is completely 
based on backtracking, whose top-down structure makes the argument surpri- 
singly simple and easy to follow. 

In Sec. 2, we give basic definitions and notations of resolution, backtracking
and the pigeonhole principle. We also show the relationship between resolution
and backtracking. In Sec. 3, we prove a lower bound $(n/4)^{n/4}$ of tree-like
resolution for the pigeonhole principle. In Sec. 4, we give an upper bound
$O(n^2 2^n)$ of the DAG-like resolution which is slightly better than
$O(n^3 2^n)$ proved in [4]. It should be noted that our argument in this paper
holds also for a generalized pigeonhole principle, called the weak pigeonhole
principle, which is an $m$ by $n$ ($m > n$) version of the pigeonhole principle.
Finally, in Sec. 5, we mention future research topics related to this paper.

2 Preliminaries 

A variable is a logic variable which takes the value true (1) or false (0). A
literal is a variable $x$ or its negation $\bar{x}$. A clause is a sum of
literals and a CNF formula is a product of clauses. A truth assignment for a CNF
formula $f$ is a mapping from the set of variables in $f$ into {0,1}. If there
is no truth assignment that satisfies $f$, we say that $f$ is unsatisfiable.

The pigeonhole principle is a tautology which states that there is no bijection
from a set of $n+1$ elements into a set of $n$ elements. $PHP_n^{n+1}$ is a CNF
formula that expresses a negation of the pigeonhole principle; hence
$PHP_n^{n+1}$ is unsatisfiable. $PHP_n^{n+1}$ consists of $n(n+1)$ variables
$x_{i,j}$ ($1 \le i \le n+1$, $1 \le j \le n$), and $x_{i,j} = 1$ means that $i$
is mapped to $j$. There are two sets of clauses. The first part consists of
clauses $(x_{i,1} + x_{i,2} + \cdots + x_{i,n})$ for $1 \le i \le n+1$. The
second part consists of clauses $(\bar{x}_{i,k} + \bar{x}_{j,k})$, where
$1 \le k \le n$ and $1 \le i < j \le n+1$. Thus there are
$(n+1) + \frac{1}{2} n^2 (n+1)$ clauses in total.
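
The clause sets can be generated mechanically, which makes the clause count and
the unsatisfiability easy to confirm for very small n. The sketch below is an
illustration only: the literal encoding (i, j, sign) and the function names are
assumptions, and the satisfiability check is a brute-force search that is
practical only for tiny instances.

```python
from itertools import combinations, product


def php_clauses(n):
    """Clauses of the n+1 by n pigeonhole formula as lists of (i, j, sign)."""
    pigeons = range(1, n + 2)
    holes = range(1, n + 1)
    # First part: every i is mapped to some j.
    first = [[(i, j, True) for j in holes] for i in pigeons]
    # Second part: no two distinct i, j are mapped to the same k.
    second = [
        [(i, k, False), (j, k, False)]
        for k in holes
        for i, j in combinations(pigeons, 2)
    ]
    return first + second


def php_satisfiable(n):
    """Brute-force search over all assignments (exponential in n*(n+1))."""
    variables = [(i, j) for i in range(1, n + 2) for j in range(1, n + 1)]
    clauses = php_clauses(n)
    for bits in product((False, True), repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[(i, j)] == sign for i, j, sign in c) for c in clauses):
            return True
    return False


print(len(php_clauses(3)), php_satisfiable(2))
```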

Resolution is a proof system for unsatisfiable CNF formulas. It consists of
only one rule called an inference rule, which infers a clause $(A + B)$ from two
clauses $(A + x)$ and $(B + \bar{x})$, where each of $A$ and $B$ denotes a sum
of literals such that there is no variable $y$ that appears positively
(negatively, resp.) in $A$ and negatively (positively, resp.) in $B$. We say
that the variable $x$ is deleted by this inference. A resolution refutation for
$f$ is a sequence of clauses $C_1, C_2, \ldots, C_t$, where each $C_i$ is a
clause in $f$ or a clause inferred from clauses $C_j$ and $C_k$ ($j, k < i$),
and the last clause $C_t$ is the empty clause ($\emptyset$). The size of a
resolution refutation is the number of clauses in the sequence.

A resolution refutation can be represented by a directed acyclic graph. (In this
sense, we sometimes use the term DAG-like resolution instead of “resolution.”)
If the graph is restricted to a tree, namely, if we can use each clause only
once to infer other clauses, then the refutation is called a tree-like
resolution refutation and the tree is called a resolution refutation tree (rrt).
More formally, an rrt for a formula $f$ is a binary rooted tree. Each vertex $v$
of an rrt corresponds to a clause, which we denote by $Cl(v)$. $Cl(v)$ must
satisfy the following conditions: For each leaf $v$, $Cl(v)$ is a clause in $f$,
and for each vertex $v$ other than leaves, $Cl(v)$ is a clause inferred from
$Cl(v_1)$ and $Cl(v_2)$, where $v_1$ and $v_2$ are $v$'s children. For the root
$v$, $Cl(v)$ is the empty clause ($\emptyset$). The size of an rrt is the number
of vertices in the rrt.
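
The inference rule and the definition of a refutation as a sequence can be
made concrete with a small checker. In this sketch, which is only an
illustration, a literal is a nonzero integer (x for a variable, -x for its
negation), a clause is a frozenset of literals, and the proof format with
"axiom" and "res" steps is an assumption introduced here.

```python
def resolve(c1, c2, x):
    """Infer (A + B) from c1 = (A + x) and c2 = (B + not x)."""
    if x not in c1 or -x not in c2:
        raise ValueError("x must occur positively in c1 and negatively in c2")
    return frozenset((c1 - {x}) | (c2 - {-x}))


def check_refutation(formula, proof):
    """Check a sequence of steps ending in the empty clause.

    Each step is ("axiom", clause) or ("res", j, k, x), where j and k index
    earlier clauses of the sequence and x is the deleted variable.
    """
    derived = []
    for step in proof:
        if step[0] == "axiom":
            clause = frozenset(step[1])
            if clause not in formula:
                return False
        else:
            _, j, k, x = step
            if not (0 <= j < len(derived) and 0 <= k < len(derived)):
                return False
            clause = resolve(derived[j], derived[k], x)
        derived.append(clause)
    return bool(derived) and derived[-1] == frozenset()


formula = {frozenset({1}), frozenset({-1, 2}), frozenset({-2})}
proof = [
    ("axiom", {1}),
    ("axiom", {-1, 2}),
    ("res", 0, 1, 1),   # infers the clause (x2)
    ("axiom", {-2}),
    ("res", 2, 3, 2),   # infers the empty clause
]
print(check_refutation(formula, proof))
```

In this example every clause is used only once, so the same sequence is also a
tree-like refutation; its size is 5.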

Backtracking (e.g., 0 ) determines whether a given CNF formula is satisfiable 
or not in the following way: For a formula $f$, variable $x$ and
$a \in \{0, 1\}$, let $f_{x=a}$ be the formula obtained from $f$ by substituting
value $a$ for $x$. In each step, we choose a variable $x$ and calculate
$f_{x=0}$ and $f_{x=1}$ recursively. If $f$ is simplified to the constant true
function at any point, then $f$ is satisfiable, and otherwise, $f$ is
unsatisfiable.

Backtracking search is also represented by a tree. A backtracking tree (btt) for
an unsatisfiable formula $f$ is a binary rooted tree satisfying the following
three conditions: (i) Two edges $e_1$ and $e_2$ from a vertex $v$ are labeled as
$(x = 0)$ and $(x = 1)$ for a variable $x$. (ii) For each leaf $v$ and each
variable $x$, $x$ appears at most once (in the form of $(x = 0)$ or $(x = 1)$)
in the path from the root to $v$. (iii) For each vertex $v$, let $As(v)$ be the
(partial) truth assignment obtained by collecting labels of edges in the path
from the root to $v$. Then, for each $v$, $v$ is a leaf iff $f$ becomes false by
$As(v)$. (Recall that we consider only unsatisfiable formulas.) The size of a
btt is the number of vertices in the btt.
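
A minimal backtracking procedure that also counts the vertices it visits can be
sketched as follows. The clause encoding (sets of nonzero integers) and the
rule of always branching on the smallest remaining variable are assumptions made
for the illustration; any other branching rule gives another backtracking tree.

```python
def assign(clauses, var, value):
    """Substitute var := value; return None if some clause becomes false."""
    true_literal = var if value else -var
    reduced = []
    for clause in clauses:
        if true_literal in clause:
            continue  # the clause is satisfied and disappears
        rest = clause - {var, -var}
        if not rest:
            return None  # every literal of the clause is false
        reduced.append(rest)
    return reduced


def backtrack(clauses):
    """Return (satisfiable, number of vertices visited by the search)."""
    if clauses is None:
        return False, 1  # a leaf: the formula became false
    if not clauses:
        return True, 1   # the formula became the constant true function
    var = min(abs(lit) for clause in clauses for lit in clause)
    size = 1
    for value in (0, 1):
        sat, sub_size = backtrack(assign(clauses, var, value))
        size += sub_size
        if sat:
            return True, size
    return False, size


print(backtrack([{1}, {-1}]))
```

For an unsatisfiable input the visited vertices form a backtracking tree in the
above sense, and the second component is its size.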

Proposition 1. Let f be an unsatisfiable CNF formula. If there exists an rrt 
for f whose size is k, then there exists a btt for f whose size is at most k. 

Proof. Let $R$ be an rrt for $f$. It is known that a shortest tree-like resolution
refutation is regular, i.e., for each path from the root to a leaf, each variable is 
deleted at most once [^. Thus we can assume, without loss of generality, that R 
is regular. 

From $R$, we construct a btt $B$ which is isomorphic to $R$. What we actually
do is to give a label to each edge in the following way: Let $v_i$ be a vertex
of $R$ and let $v_{i_1}$ and $v_{i_2}$ be its children. Suppose
$Cl(v_i) = (A + B)$, $Cl(v_{i_1}) = (A + x)$ and $Cl(v_{i_2}) = (B + \bar{x})$.
Then the labels $(x = 0)$ and $(x = 1)$ are assigned to edges $(u_i, u_{i_1})$
and $(u_i, u_{i_2})$, respectively, where $u_i$ is a vertex of $B$ corresponding
to $v_i$ of $R$. We shall show that this $B$ is a btt for $f$.

It is not hard to see that the conditions (i) and (ii) for btt are satisfied.
In the following, we show that for any leaf $u$ of $B$, $f$ becomes false by
$As(u)$. This is enough for the condition (iii) because if $f$ becomes false in
some non-leaf node, then we can simply cut the tree at that point and can get a
smaller one. To this end, we prove the following statement by induction: For
each $i$, the clause $Cl(v_i)$ of $R$ becomes false by the partial assignment
$As(u_i)$ of $B$. For the induction basis, it is not hard to see that the
statement is true for the root. For the induction hypothesis, suppose that the
statement is true for a vertex $v_i$, i.e., $Cl(v_i)$ becomes false by
$As(u_i)$. Now we show that the statement is also true for $v_i$'s children. Let
$v_{i_1}$ and $v_{i_2}$ be $v_i$'s children, and let $Cl(v_i) = (A + B)$,
$Cl(v_{i_1}) = (A + x)$ and $Cl(v_{i_2}) = (B + \bar{x})$. Then the label of the
edge $(u_i, u_{i_1})$ is $(x = 0)$, and hence,
$As(u_{i_1}) = As(u_i) \cup \{(x = 0)\}$. Since $As(u_i)$ makes $(A + B)$ false,
$As(u_{i_1})$ makes $(A + x)$ false. The same argument shows that $As(u_{i_2})$
makes $Cl(v_{i_2})$ false. Now the above statement is proved, which immediately
implies that $As(u_i)$ makes $f$ false for every leaf $u_i$ of $B$. □

Thus to show a lower bound of tree-like resolution, it suffices to show a lower 
bound on the size of btts. 

3 A Lower Bound 

In this section, we prove a lower bound on the size of tree-like resolution for
$PHP_n^{n+1}$.

Theorem 1. Any tree-like resolution refutation for $PHP_n^{n+1}$ requires
$(n/4)^{n/4}$ steps.

Proof. By Proposition 1, it is enough to show that any btt for $PHP_n^{n+1}$
requires $(n/4)^{n/4}$ vertices. For simplicity, we consider the case that $n$
is a multiple of 4. Let $B$ be an arbitrary btt for $PHP_n^{n+1}$. As we have
seen before, each vertex $v$ of $B$ corresponds to a partial truth assignment
$As(v)$. For a better exposition, we use an $n+1$ by $n$ array representation to
express a partial assignment for $PHP_n^{n+1}$.
Fig. 1 shows an example of PHP^. A cell in column i and row j corresponds to 
the variable Xij. We consider that the value 1 (resp. 0) is assigned to the variable 
Xij if the {i,j) entry of the array is 1 (resp. 0). For example. Fig. 1 (b) expresses 
a partial assignment such that xi^j^ = X2,2 = a^4,4 = 2^5,3 = 0, = CC44 = 1 . It 

should be noted that $PHP_n^{n+1}$ becomes false at a vertex $v$ iff (i) $As(v)$
contains a column filled with 0s or (ii) $As(v)$ contains a row in which two 1s
exist.

Here are some notations. For a partial assignment $A$, a 0 in the $(i,j)$ entry
of $A$ is called a bad 0 if neither the column $i$ nor the row $j$ contains a 1,
and is called a good 0 otherwise. (The reason why we use terms “bad” and “good”
will be seen later. Bad 0s make it difficult to count the number of vertices in
$B$ in our analysis.) $\#BZ(A)$ denotes the number of bad 0s in $A$. A variable
$x_{i,j}$ is called an active variable for $A$ if $x_{i,j}$ is not yet assigned
(that is, the $(i,j)$ entry of $A$ is blank) and neither the column $i$ nor the
row $j$ contains a 1. For example, let
$A_0$ be the assignment in Fig. 1 (b). Then $\#BZ(A_0) = 2$ (0s assigned to
$x_{1,1}$ and $x_{2,2}$). For example, $x_{2,4}$ is an active variable for
$A_0$. Let $v$ be a vertex of $B$ and $v_0$ and $v_1$ be its children. Suppose
that labels of edges $(v, v_0)$ and $(v, v_1)$ are $(x = 0)$ and $(x = 1)$,
respectively. Then we write $Var(v) = x$, namely, $Var(v)$ denotes the variable
selected for substitution at the vertex $v$. We call $v_0$ a false-child of $v$
and write $F(v) = v_0$. Similarly we call $v_1$ a true-child of $v$ and write
$T(v) = v_1$.
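
The notions of bad 0 and active variable translate directly into code. In the
sketch below a partial assignment is a dictionary from (column, row) pairs to 0
or 1; this encoding, the function names and the sample assignment are
assumptions made for illustration, and the sample is not claimed to be the
assignment drawn in Fig. 1 (b).

```python
def column_has_one(A, i):
    return any(v == 1 and col == i for (col, row), v in A.items())


def row_has_one(A, j):
    return any(v == 1 and row == j for (col, row), v in A.items())


def bad_zero_count(A):
    """Number of 0 entries whose column and row both contain no 1."""
    return sum(
        1
        for (i, j), v in A.items()
        if v == 0 and not column_has_one(A, i) and not row_has_one(A, j)
    )


def is_active(A, i, j):
    """x_{i,j} is unassigned and neither column i nor row j contains a 1."""
    return (i, j) not in A and not column_has_one(A, i) and not row_has_one(A, j)


# A hypothetical partial assignment for the 5 by 4 array.
A = {(1, 1): 0, (2, 2): 0, (4, 4): 0, (5, 3): 0, (4, 3): 1}
print(bad_zero_count(A), is_active(A, 2, 4), is_active(A, 4, 1))
```

Here the 0s at (4,4) and (5,3) are good because column 4 and row 3 contain the
1 at (4,3), while the 0s at (1,1) and (2,2) are bad.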

We want to show a lower bound on the number of vertices in $B$. To this end,
we construct a tree $S$ from $B$. Before showing how to construct $S$, we show
some properties of $S$: The set of vertices of $S$ is a subset of the set of
vertices of $B$, so the number of vertices of $S$ gives a lower bound on the
number of vertices of $B$. Each internal vertex of $S$ has either a single child
or exactly $n/4$ children. The height of $S$ is $n/2$; more precisely, the
length of any path from the root to a leaf is exactly $n/2$.

Fig. 1. An array representation of a partial assignment for $PHP_4^5$

Now we show how to construct $S$. The root of $S$ is the root of $B$. For each
vertex $v$ of $S$, we select the set $CH(v)$ of $v$'s children in the following
way: $CH(v)$ is initially empty and vertices are added to $CH(v)$ one by one
while tracing the tree $B$ down from the vertex $v$. We look at vertices $T(v)$,
$T(F(v))$, $T(F(F(v)))$ $(= T(F^2(v)))$, $\ldots$, $T(F^l(v))$, $\ldots$ in this
order. We add $T(F^l(v))$ ($l \ge 0$) to $CH(v)$ if $Var(F^l(v))$ is an active
variable for $As(F^l(v))$. Fig. 2 illustrates how to trace the tree $B$ when we
construct the tree $S$. In this example, $T(F^2(v))$ is “skipped” because
$Var(F^2(v))$ is not an active variable. We stop adding vertices if $|CH(v)|$
becomes $n/4$. In this case, $v$ has exactly $n/4$ children.




Fig. 2. A part of S constructed from B 



However, there is one exceptional condition to stop adding vertices to $CH(v)$
even if $|CH(v)|$ is less than $n/4$: when the number of bad 0s in some column
reaches $n/2$, we stop adding vertices to $CH(v)$. More formally, let us
consider a vertex $F^l(v)$. Suppose that the number of bad 0s in each column of
$As(F^l(v))$ is at most $n/2 - 1$. Also, suppose that $Var(F^l(v))$ is $x_{i,j}$
where the column $i$ of $As(F^l(v))$ contains exactly $n/2 - 1$ bad 0s and there
is no 1 in the row $j$ of $As(F^l(v))$. (See Fig. 3 for an example of the case
that $n = 12$. There are eight 0s in the column $i$. Among them, five 0s are bad
0s.) Then $As(F^{l+1}(v))$ contains $n/2$ bad 0s in the column $i$, and hence we
do not look for vertices any more, namely, $T(F^l(v))$ is the last vertex added
to $CH(v)$. (Note that $T(F^l(v))$ is always selected since $x_{i,j}$ is an
active variable for $As(F^l(v))$.) In this case, $|CH(v)|$ may be less than
$n/4$. If so, we adopt only the last vertex as a child of $v$, i.e., only
$T(F^l(v))$ is a child of $v$ in $S$. Thus, in this case, $v$ has only one
child. It should be noted that, in the tree $S$, every assignment corresponding
to a vertex of depth $i$ contains exactly $i$ 1s. We continue this procedure
until the length of every path from the root to a leaf becomes $n/2$.
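
The selection of $CH(v)$ can be sketched as a loop along the chain of
false-children. This is an illustration under several assumptions: the btt $B$
is not stored explicitly but given by a hypothetical function choose_var that
returns the variable selected at the vertex with a given partial assignment,
assignments are dictionaries from (column, row) to 0 or 1, and the thresholds
n/4 (number of children) and n/2 (bad 0s per column) are the ones used in this
section.

```python
def column_has_one(A, i):
    return any(v == 1 and col == i for (col, row), v in A.items())


def row_has_one(A, j):
    return any(v == 1 and row == j for (col, row), v in A.items())


def is_active(A, i, j):
    return (i, j) not in A and not column_has_one(A, i) and not row_has_one(A, j)


def bad_zeros_in_column(A, i):
    if column_has_one(A, i):
        return 0
    return sum(
        1
        for (col, row), v in A.items()
        if col == i and v == 0 and not row_has_one(A, row)
    )


def select_children(A, n, choose_var):
    """Assignments of the vertices chosen as children in S of the vertex A."""
    quota, cap = n // 4, n // 2
    children = []
    current = dict(A)  # assignment of F^l(v), starting with l = 0
    while True:
        i, j = choose_var(current)
        if is_active(current, i, j):
            child = dict(current)
            child[(i, j)] = 1  # the true-child T(F^l(v))
            children.append(child)
            if bad_zeros_in_column(current, i) == cap - 1:
                # The false-child would contain n/2 bad 0s in column i: stop.
                return children if len(children) == quota else [child]
            if len(children) == quota:
                return children
        current[(i, j)] = 0  # move on to the false-child F^(l+1)(v)


def first_unassigned(n):
    """A hypothetical btt: always branch on the first unassigned variable."""
    def choose(A):
        for i in range(1, n + 2):
            for j in range(1, n + 1):
                if (i, j) not in A:
                    return i, j
        raise ValueError("no unassigned variable left")
    return choose


print(len(select_children({}, 8, first_unassigned(8))))
```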




Fig. 3. A condition to stop tracing B 



We then show that it is possible to construct such $S$ from any $B$. To see
this, it suffices to show that we never reach a leaf of $B$ while tracing $B$ to
construct $S$. Recall the definition of a backtracking tree: For any leaf $u$ of
btt, $As(u)$ makes the CNF formula false. For $As(u)$ to make $PHP_n^{n+1}$
false, $As(u)$ must have a row containing two 1s or a column full of 0s. The
former case does not happen in $S$ because we have skipped such vertices in
constructing $S$. The latter case does not happen for the following reason:
Recall that once the number of bad 0s in some column reaches $n/2$, we stop
tracing the tree $B$. So, as long as we trace $B$ in constructing $S$, we never
visit an assignment such that the number of bad 0s in a column exceeds
$n/2 - 1$. Also, recall that the number of 1s in $As(v)$ is at most $n/2$ since
$v$'s depth in $S$ is at most $n/2$. So the number of good 0s in a column is at
most $n/2$. Hence the number of 0s in each column is at most $n - 1$, and so, no
column ever becomes filled with 0s. Now let us consider the following
observation which helps to prove later lemmas.

Observation 1. Consider a vertex $v$ in $B$ and let $v' = F^l(v)$ for some $l$
(see Fig. 4). Suppose first that $T(v')$ is added to $CH(v)$ in constructing
$S$. Then $Var(v')$ must be an active variable for $v'$, and hence
$\#BZ(As(F(v'))) = \#BZ(As(v')) + 1$. On the other hand, suppose that $T(v')$ is
not added to $CH(v)$ in constructing $S$. Then $Var(v')$ is not an active
variable for $v'$. Therefore, $\#BZ(As(F(v'))) = \#BZ(As(v'))$.

Now we have a tree $S$ having the following properties: The length of any path
from the root to a leaf is $n/2$. Every vertex in $S$ except for leaves has
exactly $n/4$ children or one child. When a vertex $v$ has one child, we call
the edge between $v$ and its child a singleton.

Fig. 4. An example for Observation 1



Lemma 1. Consider an arbitrary path $P = u_0 u_1 u_2 \cdots u_{n/2}$ in $S$,
where $u_0$ is the root and $u_{n/2}$ is a leaf. For each $k$, if
$(u_{k-1}, u_k)$ is not a singleton, then
$\#BZ(As(u_k)) \le \#BZ(As(u_{k-1})) + n/4 - 1$.

Proof. Consider $u_{k-1}$ and $u_k$ in the path $P$. Let $v$ be the vertex in
the btt $B$ corresponding to $u_{k-1}$. Then there is some $l$ such that
$T(F^l(v))$ corresponds to $u_k$. By Observation 1,
$\#BZ(As(F^l(v))) - \#BZ(As(v))$ is equal to the number of vertices added to
$CH(v)$ among $T(v), T(F(v)), T(F^2(v)), \ldots, T(F^{l-1}(v))$. So
$\#BZ(As(F^l(v))) - \#BZ(As(v)) \le n/4 - 1$. Note that $As(T(F^l(v)))$ is the
result of adding one 1 to some active variable of $As(F^l(v))$, so
$\#BZ(As(T(F^l(v)))) \le \#BZ(As(F^l(v)))$. Hence
$\#BZ(As(T(F^l(v)))) \le \#BZ(As(v)) + n/4 - 1$, namely,
$\#BZ(As(u_k)) \le \#BZ(As(u_{k-1})) + n/4 - 1$. □

Lemma 2. Consider an arbitrary path $P = u_0 u_1 u_2 \cdots u_{n/2}$ in $S$,
where $u_0$ is the root and $u_{n/2}$ is a leaf. For each $k$, if
$(u_{k-1}, u_k)$ is a singleton, then
$\#BZ(As(u_k)) \le \#BZ(As(u_{k-1})) - n/4$.

Proof. Suppose that the edge $(u_{k-1}, u_k)$ is a singleton. Let $v$ and
$T(F^l(v))$ be vertices in $B$ corresponding to $u_{k-1}$ and $u_k$,
respectively. The same argument as in Lemma 1 shows that
$\#BZ(As(F^l(v))) - \#BZ(As(v)) \le n/4 - 1$. Let $Var(F^l(v)) = x_{i,j}$. Since
$(u_{k-1}, u_k)$ is a singleton, there are $n/2 - 1$ bad 0s in the column $i$ of
$As(F^l(v))$. Thus substituting the value 1 for the variable $x_{i,j}$ makes at
least $n/2 - 1$ bad 0s good, namely,
$\#BZ(As(T(F^l(v)))) \le \#BZ(As(F^l(v))) - (n/2 - 1)$. Hence
$\#BZ(As(T(F^l(v)))) \le \#BZ(As(v)) - n/4$, namely,
$\#BZ(As(u_k)) \le \#BZ(As(u_{k-1})) - n/4$. □

Lemma 3. Consider an arbitrary path $P = u_0 u_1 u_2 \cdots u_{n/2}$ in $S$,
where $u_0$ is the root and $u_{n/2}$ is a leaf. The number of singletons in $P$
is at most $n/4$.

Proof. Suppose that there exist more than $n/4$ singletons in the path $P$. We
count the number of bad 0s of assignments along $P$. At the root,
$\#BZ(As(u_0)) = 0$. Going down the path from the root $u_0$ to the leaf
$u_{n/2}$, the number of bad 0s is increased at most
$(n/4 - 1)(n/4 - 1) < n^2/16$ by Lemma 1, and is decreased at least
$\frac{n}{4}(\frac{n}{4} + 1) > n^2/16$ by Lemma 2. This is a contradiction
because the number of bad 0s becomes negative at the leaf. This completes the
proof. □

Fig. 5. Shrinking S to obtain S'

Finally we “shrink” the tree $S$ by deleting all singletons from $S$. Let $S'$
be the resulting tree (see Fig. 5). Each vertex of $S'$ has exactly $n/4$
children and the length of every path from the root to a leaf is at least $n/4$
by Lemma 3. So there are at least $(n/4)^{n/4}$ vertices in $S'$, and hence, the
theorem follows. □
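
To get a feeling for the size of this bound, the sketch below compares
$(n/4)^{n/4}$ with a quantity of order $n^2 2^n$, which is taken here as an
assumption for the order of growth of the DAG-like upper bound of Sec. 4 with
its constant factor dropped. The function names are illustrative.

```python
import math


def tree_lower_bound(n):
    """(n/4)^(n/4) for n a multiple of 4."""
    assert n % 4 == 0
    return (n // 4) ** (n // 4)


def dag_upper_bound(n):
    """n^2 * 2^n, with the constant factor of the O-notation dropped."""
    return n ** 2 * 2 ** n


def exponent_ratio(n):
    """log2 of the lower bound divided by log2 of the upper bound."""
    return (n / 4) * math.log2(n / 4) / (n + 2 * math.log2(n))


print(tree_lower_bound(128) > dag_upper_bound(128), exponent_ratio(2 ** 20))
```

The ratio of the exponents grows like (log n)/4, so the lower bound overtakes
every fixed power of the upper bound only for large n; for n = 64 it is still
below $n^2 2^n$.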

Remark. By slightly modifying the above proof, we can get a better lower 
bound of ( 4 ;og 2 For the tree S in the above proof, we restrict the number of 
children of each node, the maximum number of bad Os in each column, and the 
height of the tree with f , f , and respectively. To get a better lower bound, 
we let them be 6'^n, 5n, and (1 — 5)n, respectively, with 5 = Then, the 

number of singletons in each path is at most <5(1 — (5) n and hence we can obtain 
a lower bound 

4 An Upper Bound 

It is known that the size of a DAG-like resolution refutation for $PHP_n^{n+1}$
is $O(n^3 2^n)$ [4]. Here we show a slightly better upper bound which is
obtained by the similar argument as [4]. We can also obtain an upper bound of
tree-like resolution refutation for $PHP_n^{n+1}$ as a corollary.

Theorem 2. There is a DAG-like resolution refutation for $PHP_n^{n+1}$ whose
size is $O(n^2 2^n)$.

Proof. Let $Q$ and $R$ be subsets of $\{1, 2, \ldots, n+1\}$ and
$\{1, 2, \ldots, n\}$, respectively. Then we denote by $P_{Q,R}$ the sum of
positive literals $x_{i,j}$, where $i \in Q$ and $j \in R$. Let $[i,j]$ denote
the set $\{i, i+1, \ldots, j-1, j\}$.

We first give a rough sketch of the refutation and then describe it in detail.
The 0th level of the refutation has the single clause $P_{\{1\},[1,n]}$. The
first level consists of $n$ clauses $P_{[1,2],R^{(n-1)}}$ for all sets
$R^{(n-1)} \subset [1,n]$ of size $n-1$. The second level consists of
${}_nC_{n-2}$ clauses $P_{[1,3],R^{(n-2)}}$ for all sets
$R^{(n-2)} \subset [1,n]$ of size $n-2$. Generally speaking, the $i$th level
consists of ${}_nC_{n-i}$ clauses $P_{[1,i+1],R^{(n-i)}}$ for all
$R^{(n-i)} \subset [1,n]$ of size $n-i$. At the $(n-1)$th level, we have
${}_nC_1 = n$ clauses $P_{[1,n],\{1\}}, P_{[1,n],\{2\}}, \ldots, P_{[1,n],\{n\}}$.
Finally, at the $n$th level, we have the empty clause. We call the clauses
described here main clauses. Note that there are
$\sum_{i=0}^{n} {}_nC_i = 2^n$ main clauses. Fig. 6 shows an example of the case
when $n = 4$. A “+” sign in the $(i,j)$ entry means the existence of the literal
$x_{i,j}$ in that clause.

Fig. 6. Main clauses of the refutation



Then we describe the detail of the refutation. Each clause at the $i$th level
is obtained by using $i$ clauses of the $(i-1)$th level and some initial
clauses. To construct a clause $P_{[1,i+1],\{j_1,j_2,\ldots,j_{n-i}\}}$ at the
$i$th level, we use $i$ clauses $P_{[1,i],\{j_1,j_2,\ldots,j_{n-i},k\}}$ for all
$k \notin \{j_1, j_2, \ldots, j_{n-i}\}$. First, for each $k$, we construct a
clause $P_{[1,i],\{j_1,j_2,\ldots,j_{n-i}\}} \cup \bar{x}_{i+1,k}$ using the
clause $P_{[1,i],\{j_1,j_2,\ldots,j_{n-i},k\}}$ of the $(i-1)$th level and $i$
clauses
$(\bar{x}_{1,k} + \bar{x}_{i+1,k})(\bar{x}_{2,k} + \bar{x}_{i+1,k}) \cdots (\bar{x}_{i,k} + \bar{x}_{i+1,k})$.
Then we construct a target clause $P_{[1,i+1],\{j_1,j_2,\ldots,j_{n-i}\}}$ by
using those $i$ clauses $P_{[1,i],\{j_1,j_2,\ldots,j_{n-i}\}} \cup \bar{x}_{i+1,k}$
and the initial clause $P_{\{i+1\},[1,n]}$. Fig. 7 illustrates an example of
deriving $P_{[1,3],\{1,4\}}$ in the second level from clauses
$P_{[1,2],\{1,2,4\}}$ and $P_{[1,2],\{1,3,4\}}$ in the first level. Similarly as
the “+” sign, a “−” sign in the $(i,j)$ entry means the existence of the literal
$\bar{x}_{i,j}$ in that clause.




Fig. 7. Constructing a main clause

Thus to construct each main clause, we need $O(n^2)$ steps. Since there are
$2^n$ main clauses, the size of the above refutation is bounded by
$O(n^2 2^n)$. □
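
The level-by-level construction can be replayed mechanically for small n. The
sketch below is an illustration under stated assumptions: a literal is a pair
((i, j), sign), a clause is a frozenset of literals, and the function counts
only inference steps (initial clauses are not counted). It asserts along the
way that each derived clause is the intended main clause and that the last
level is the empty clause.

```python
from itertools import combinations


def resolve(c1, c2, var):
    """Resolve c1 (containing var positively) with c2 (containing its negation)."""
    pos, neg = (var, True), (var, False)
    assert pos in c1 and neg in c2
    return (c1 - {pos}) | (c2 - {neg})


def P(Q, R):
    """The clause P_{Q,R}: the sum of positive literals x_{i,j}, i in Q, j in R."""
    return frozenset(((i, j), True) for i in Q for j in R)


def refute_php(n):
    """Derive the empty clause level by level; return the number of inferences."""
    holes = range(1, n + 1)
    steps = 0
    # Level 0: the single clause P_{{1},[1,n]}, stored under its hole set.
    level = {frozenset(holes): P([1], holes)}
    for i in range(1, n + 1):
        new_level = {}
        for J in map(frozenset, combinations(holes, n - i)):
            target = P([i + 1], holes)  # the initial clause P_{{i+1},[1,n]}
            for k in set(holes) - J:
                clause = level[J | {k}]  # a main clause of level i-1
                for p in range(1, i + 1):
                    hole_clause = frozenset({((p, k), False), ((i + 1, k), False)})
                    clause = resolve(clause, hole_clause, (p, k))
                    steps += 1
                # Now clause is P_{[1,i],J} together with the negation of x_{i+1,k}.
                target = resolve(target, clause, (i + 1, k))
                steps += 1
            assert target == P(range(1, i + 2), J)
            new_level[J] = target
        level = new_level
    assert level[frozenset()] == frozenset()
    return steps


print(refute_php(3))
```

Each main clause of level i costs i(i+1) inferences in this replay, which is
the $O(n^2)$ cost per main clause used in the proof.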

Corollary 1. There is a tree-like resolution refutation for $PHP_n^{n+1}$ whose
size is $O(n^2 n!)$.

Proof. This is obtained by reforming the directed acyclic graph of the
refutation obtained in Theorem 2 into a tree in a trivial manner. For main
clauses, we have one level-$n$ clause, $n$ level-$(n-1)$ clauses, $n(n-1)$
level-$(n-2)$ clauses and so on. Generally speaking, we have
$n(n-1) \cdots (i+1)$ level-$i$ clauses. Thus we have
$1 + \sum_{i=0}^{n-1} n(n-1) \cdots (i+1) = n! \sum_{i=0}^{n} 1/i! < 3\,n!$
main clauses. Each main clause is constructed in $O(n^2)$ steps and hence the
size of the refutation is $O(n^2 n!)$. □

5 Concluding Remarks 

By Theorems 1 and 2, we can see that the size of any tree-like resolution
refutation is superpolynomially larger than the size of a shortest DAG-like
resolution refutation. An interesting future research is to find a set of
formulas that separates tree-like and DAG-like resolutions in the rate of
$2^{cn}$ for some constant $c$, improving [2]. Another research topic is to find
a tighter bound of the tree-like
resolution for the pigeonhole principle. Note that an upper bound 0(n^n!) and 
a lower bound obtained in this paper are tight in the sense that they 

both grow at the same rate of An open question is whether we can 

get a tighter lower bound, e.g., 

Acknowledgments. The authors would like to thank Magnus M. Halldorsson 
for his valuable comments.

Abstract. In this paper we study two practical variations of the map
labeling problem: Given a set $S$ of $n$ distinct sites in the plane, one needs
to place at each site: (1) a pair of uniform and non-intersecting squares
of maximum possible size, (2) a pair of uniform and non-intersecting
circles of maximum possible size. Almost nothing has been done before
in this aspect, i.e., multi-label map labeling. We obtain constant-factor
approximation algorithms for these problems. We also study bicriteria
approximation schemes based on polynomial time approximation schemes
(PTAS) for these problems.



1 Introduction 

Map labeling is an old art in cartography which has found new applications in
recent years in GIS, graphics and graph drawing. Extensive research has been
conducted in this area [BC94, CMS93, CMS95, DF92, FW91, Imh75, Jon89,
KR92, PZC98, Wag94]. Recently much of the research has been on generalizing the
problems (models). One direction is to allow each site to have many, sometimes
an infinite number of, possible labels (see [DMMMZ97, IL97, KSW98]). Another
direction is to study the corresponding problem in a related area such as graph
drawing [KT97, KT98a]. On the other hand, many of the realistic problems, like
map labeling with multiple labels and different label shapes, have not received
much attention.

The multiple-label map labeling problem comes from weather forecasting
programs on TV, where each city has two labels: its name and its temperature. This
naturally gives rise to two optimization problems: maximizing the
label size and maximizing the number of sites labeled. This realistic example
of map labeling, which almost everybody encounters daily, is largely
ignored in the research on map labeling. The only result in this respect we know
of is a very recent short note by Kakoulis and Tollis [KT98], which presents a
practical heuristic algorithm for solving this problem. However, no theoretical
result is known.

Although the computational complexity of these problems is not yet known,
it is very likely that they are all NP-hard. Thus, finding efficient approximate
solutions is meaningful. We will mainly follow [DMMMZ97] to obtain efficient
approximate solutions and approximation schemes for these problems. However,
these extensions are non-trivial. In fact, in many ways we either simplify
the methods in [DMMMZ97] or further generalize the results there.

This paper is organized as follows. Section 2 contains a brief summary of
results. Section 3 formally defines the problems. Sections 4 and 5 discuss
the constant factor approximation algorithms for labeling a set of sites with
uniform (axis-parallel) square pairs and uniform circle pairs, respectively.
Section 6 contains a bicriteria approximation scheme for these problems.
Section 7 concludes the paper.

2 A Summary of Results 

In this paper, for the first time in the literature, we study approximation algorithms
for the general multi-label map labeling problem with uniform (axis-parallel)
square pairs and uniform circle pairs. We obtain constant factor approximation
algorithms for these problems. Recall that an approximation algorithm for a
(maximization) optimization problem $\Pi$ provides a performance guarantee
of $\rho$ if for every instance $I$ of $\Pi$, the solution value returned by the
approximation algorithm is at least $1/\rho$ of the optimal value for $I$. (For simplicity of
description, we simply say that this is a factor $\rho$ approximation algorithm for
$\Pi$.) Our main results are summarized as follows:

1. For map labeling with uniform square pairs, we design a polynomial time 
approximation algorithm with a performance guarantee of 4. 

2. For map labeling with uniform circle pairs, we design a time-optimal
approximation algorithm with a performance guarantee of 2.

3. We study bicriteria approximation schemes for the above two problems.
A bicriteria approximation scheme for the map labeling problem is one that,
for any given $\epsilon$, labels a $(1 - \epsilon)$ fraction of the sites with
labels whose size is at least $(1 - c' \cdot \epsilon)$ times
the size of the optimal solution, where $c'$ is some constant.

3 Preliminaries 

In this section we formally define the problems to be studied. We also give some
additional definitions related to our algorithms. To keep our descriptions easily
accessible, we restrict every square used in this paper to be axis-parallel, although
by sacrificing the performance of the algorithms we can generalize the results to
arbitrary uniform square pairs.

3.1 Map Labeling with Uniform Square Pairs (MLUSP)

The decision version of the MLUSP problem is defined as follows:

Instance: A set $S$ of points (sites) $p_1, p_2, \ldots, p_n$ in the plane and an
integer $l$.

Problem: Does there exist a set of $n$ pairs of squares of size (i.e., side
length) $l$, one pair placed at each input site $p_i \in S$, such that no two
squares intersect and no site is contained in any square?







Fig. 1. Examples of labeling sites with square and circle pairs. 



It should be noted that in this problem the site can be anywhere on the 
boundary of the two labeling squares (Figure 1 (a)). It is not known whether the 
MLUSP problem is NP-complete or not. Nevertheless, we will try to approximate 
the corresponding maximization problem in the subsequent sections. From now 
on, MLUSP will always refer to the maximization problem. 



3.2 Map Labeling with Uniform Circle Pairs (MLUCP)

The MLUCP problem is defined as follows:

Instance: A set $S$ of points (sites) $p_1, p_2, \ldots, p_n$ in the plane and an integer
$l$.

Problem: Does there exist a set of $n$ uniform circle pairs of radius $l$, one
pair placed at each input site $p_i \in S$, such that no two circles intersect and
no site is contained in any circle?

Because of the nature of this problem, the two circles labeling a site $p_i$ must
be tangent to each other, and $p_i$ is exactly the tangent point (Figure 1 (b)).
Again, it is not known whether the MLUCP problem is NP-complete or not.

3.3 The Minimum 3-Diameter under $L_\infty$

Let $k \ge 2$ be an integer. Given a set $S$ of $k$ sites in the plane, the
$k$-diameter of $S$ under the $L_\infty$ metric (or $L_\infty$ $k$-diameter for short) is defined as
the maximum $L_\infty$ distance between any two points in $S$. Given a set $S$ of at least
$k$ sites in the plane, the min-$k$-diameter of $S$ under $L_\infty$, denoted by $D_{k,\infty}(S)$, is
the minimum $L_\infty$ $k$-diameter over all subsets of $S$ of size $k$. We can
define a similar set of notations for the $L_2$ (i.e., Euclidean) metric. In particular,
we denote by $D_k(S)$ the minimum $k$-diameter (under $L_2$) over all
subsets of $S$ of size $k$. Observe that $D_{k,\infty}(S) \le D_k(S)$ for any $S$ and $k$.
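
The two min-$k$-diameters defined above can be illustrated with a small brute-force sketch. This is only an illustration of the definitions, not the $O(n \log n)$ method of [EE94]; the function names `linf` and `min_k_diameter` and the sample sites are our own assumptions.

```python
from itertools import combinations
from math import dist

def linf(p, q):
    """L-infinity (Chebyshev) distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def min_k_diameter(sites, k, metric):
    """Brute-force min-k-diameter: the smallest diameter, under `metric`,
    over all k-subsets of `sites`. Exponential in k; for illustration only."""
    if k < 2 or len(sites) < k:
        raise ValueError("need k >= 2 and at least k sites")
    return min(
        max(metric(p, q) for p, q in combinations(subset, 2))
        for subset in combinations(sites, k)
    )

# Hypothetical sample instance: three sites close together, one far away.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
d3_inf = min_k_diameter(sites, 3, linf)   # D_{3,inf}(S)
d3_l2 = min_k_diameter(sites, 3, dist)    # D_3(S) under the Euclidean metric
```

On this instance the $L_\infty$ value does not exceed the Euclidean one, in line with the observation $D_{k,\infty}(S) \le D_k(S)$.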

The following two lemmas show why the 3-diameter, in particular the
3-diameter under $L_\infty$, is relevant to labeling with square pairs.

Lemma 1. Given a set of three points with 3-diameter $D_{3,\infty}$ under $L_\infty$, the
optimal labeling square pairs have size at most $D_{3,\infty}$.

Proof. Let the three points be $p_1$, $p_2$ and $p_3$. Without loss of generality, let $D_{3,\infty}$
be determined by $p_1$ and $p_2$. Clearly, when we look at the axis-parallel bounding
box of the three points, its long edge (which is exactly $D_{3,\infty}$) is the maximum
size for labeling a square pair on $p_3$. □






Fig. 2. Labeling 3 sites with maximum square pairs. 



In Figure 2, we show how to label three points with maximum square pairs
(the two dotted squares are for $p_3$, as $p_1$ and $p_2$ can easily be labeled with pairs of
squares of the same size). The following lemma extends the previous one to the
case when $S$ has more than 3 sites.

Lemma 2. Given a set $S$ of $n$ sites with minimum $L_\infty$ 3-diameter $D_{3,\infty}(S)$,
the optimal labeling square pairs have size at most $D_{3,\infty}(S)$.

In the following section we present an approximate solution for labeling
a set $S$ of $n$ sites using square pairs. For ease of presentation, we will
first use $D_3(S)$, the min-3-diameter of the same set of sites under $L_2$, to
approximate $D_{3,\infty}(S)$, and obtain an approximate solution whose performance
guarantee is slightly worse than 4. Then we show how to generalize the idea to
use $D_{3,\infty}(S)$ to obtain a factor 4 approximation algorithm. Given a set $S$ of $n$
sites, both $D_{3,\infty}(S)$ and $D_3(S)$ can be computed in $O(n \log n)$ time [EE94].

4 Map Labeling With Uniform Square Pairs (MLUSP) 

In this section we present the details of an approximation algorithm for the
MLUSP problem. Let $L^*$ denote the size of each square in the optimal solution
of the MLUSP problem. By Lemma 2 and the fact that $D_{3,\infty}(S) \le D_3(S)$, we
have the following lemma.

Lemma 3. If $|S| \le 2$, then $L^*$ is unbounded, and if $|S| \ge 3$, then $L^* \le D_3(S)$.

For any two points $p_i, p_j \in S$, let $d_{ij}$ denote the Euclidean distance between
them. Let $C_i$ denote the circle centered at point $p_i \in S$ with radius $D_3(S)/\alpha$, where
$\alpha \ge 4$. As shown in [DMMMZ97], the circle $C_i$ contains at most two points
from the input set $S$, including its center. (Otherwise, the three points inside
$C_i$ would have a diameter smaller than $D_3(S)$.) Let $p_j$, if it exists, be the other site in
the circle $C_i$. Clearly $p_i$ is then the only other site contained in $C_j$. Since $\alpha \ge 4$, by the
triangle inequality, no circle $C_i$ can intersect more than one other circle. (Otherwise, if
$C_i$ intersects $C_j$ and $C_k$, then $d_{jk} \le d_{ij} + d_{ik} < D_3(S)$ and the diameter of $p_i, p_j, p_k$
would be smaller than $D_3(S)$.)

Notice that as $C_i$ can intersect at most one other circle, if $C_i$ is empty of other
sites then trivially we can place a pair of squares with edge length $D_3(S)/(\sqrt{2}\alpha)$ at $p_i$.
Therefore, we only consider the (more difficult) case when $C_i$ contains another
site $p_j$. By definition, $p_i, p_j \in C_i \cap C_j$. Therefore, there is always a half of $C_i$,
bounded away from $p_j$ by a horizontal or a vertical line through $p_i$, that is empty.
Let $X_i$ be the square pair with edge length $D_3(S)/(\sqrt{2}\alpha)$ placed at point $p_i$. If $p_i$ is
above $p_j$ then we place the square pair $X_i$ above $p_i$ and the square pair $X_j$ below
$p_j$. If $p_i$ is below $p_j$ then we place $X_i$ below $p_i$ and
$X_j$ above $p_j$. If $p_i$ and $p_j$ are on the same horizontal line (say with $p_i$ to the
left of $p_j$) then we place $X_i$ to the left of $p_i$ and $X_j$ to the right of $p_j$.

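Let $L'$ denote the edge length of each square generated by the above procedure.
We have the following lemma.

Lemma 4. $L' \ge D_3(S)/(\sqrt{2}\alpha)$. Moreover, $L' \ge L^*/(\sqrt{2}\alpha)$.

The proof of this lemma is straightforward, as $L' \ge D_3(S)/(\sqrt{2}\alpha)$ and $L^* \le D_3(S)$ by Lemma 3.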
Summarizing the above results, we have the following theorem. 

Theorem 5. For any given set of $n$ points in the plane, the above algorithm,
which runs in $O(n \log n)$ time, produces a $4\sqrt{2} \approx 5.657$ approximation for the
MLUSP problem.

The algorithm first computes $D_3(S)$ and then labels $S$ using the
above constructive procedure. We hence omit the details.

It is now straightforward to generalize this algorithm by using $D_{3,\infty}(S)$ directly:
simply redefine $C_i$ as the circle under the $L_\infty$ metric (which is geometrically
a square) centered at point $p_i \in S$ with radius $D_{3,\infty}(S)/\alpha$. Again, since $\alpha \ge 4$, no
circle can contain more than two sites and no circle can intersect more than one other
circle. The remaining steps are the same as in the above theorem, and we
are able to remove the $\sqrt{2}$ factor. Therefore we have

Theorem 6. For any given set of $n$ points in the plane, there exists a factor 4
approximation algorithm for the MLUSP problem which runs in $O(n \log n)$ time.



5 Map Labeling With Uniform Circle Pairs (MLUCP) 

In this section, we study the MLUCP problem. Let $R^*$ denote the radius of
each circle in the optimal solution. Recall that $D_2(S)$ is the Euclidean distance
between a closest pair of $S$. We have the following lemma.

Lemma 7. $R^* \le D_2(S)/2$.

Proof. Consider two points $p_i$ and $p_j$ which form a closest pair of $S$. Thus they
are at distance $d_{ij} = D_2(S)$ apart. To avoid the intersection of the two pairs of
circles and maximize the size of these circles, the best way is to place the two
pairs of circles such that the lines through the centers of each pair are parallel
to each other and perpendicular to the segment $(p_i, p_j)$. In this way the maximum radius of
these circles is $d_{ij}/2 = D_2(S)/2$. Clearly, the optimal solution for the
MLUCP problem, $R^*$, must be bounded by $D_2(S)/2$. □

5.1 Algorithm 

We are now ready to present an approximation algorithm for the MLUCP
problem. We need an algorithm called ClosestPair to compute $D_2(S)$.

Procedure PackCirclePair(S)

1. $D_2(S)$ := ClosestPair(S).

2. For each point $p_i$ in $S$, do the following:

Place a pair of circles $Y_i$ of radius $D_2(S)/4$ at $p_i$ arbitrarily.

The correctness of the above algorithm rests on Step 2. For every site $p_i$, let
$C_i$ be a circle of radius $D_2(S)/2$ centered at $p_i$. Then no two such circles can
intersect; otherwise there would be a pair of sites whose Euclidean distance is
less than $D_2(S)$, a contradiction. As the radius of each circle in the circle pair
$Y_i$ is only $D_2(S)/4$, $Y_i$ is circumscribed by $C_i$ and hence will not overlap with
any other circle pair. Using the standard results of [PS85], we can find $D_2(S)$
in $O(n \log n)$ time. Therefore the whole algorithm takes $O(n \log n)$ time and we
have the following theorem.

Theorem 8. The above $O(n \log n)$ time algorithm finds an approximate solution
with performance guarantee of 2.
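
As a rough illustration of Procedure PackCirclePair, the following Python sketch places the circle pairs and checks the result. It makes two simplifying assumptions that are ours and not the paper's: the closest pair is found by brute force in $O(n^2)$ time (rather than the $O(n \log n)$ method of [PS85]), and the "arbitrary" orientation of each pair is fixed to be horizontal. The helper names and the sample sites are hypothetical.

```python
from itertools import combinations
from math import dist

def closest_pair_distance(sites):
    """D_2(S) by brute force in O(n^2); a stand-in for an O(n log n) method."""
    return min(dist(p, q) for p, q in combinations(sites, 2))

def pack_circle_pairs(sites):
    """Return (radius, pairs): pairs[i] holds the two circle centers for sites[i].
    Each pair is tangent at its site; the horizontal layout is one arbitrary
    choice among the allowed orientations."""
    radius = closest_pair_distance(sites) / 4.0
    pairs = [((x - radius, y), (x + radius, y)) for x, y in sites]
    return radius, pairs

def is_valid(sites, radius, pairs, tol=1e-9):
    """Check that circles of different sites do not overlap (touching allowed)
    and that no site lies strictly inside any circle."""
    centers = [(c, i) for i, pair in enumerate(pairs) for c in pair]
    for (c1, i), (c2, j) in combinations(centers, 2):
        if i != j and dist(c1, c2) < 2 * radius - tol:
            return False
    return all(dist(c, s) >= radius - tol for c, _ in centers for s in sites)

sites = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
radius, pairs = pack_circle_pairs(sites)
```

Here the closest pair is at distance 2, so every circle has radius 0.5, which is within a factor 2 of the upper bound $D_2(S)/2$ from Lemma 7.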

Clearly the $O(n \log n)$ time bound is optimal: in the degenerate case when
all the $y$-coordinates of the points in $S$ are the same, any (approximate) solution
for MLUCP returns a non-zero value if and only if all the $x$-coordinates of the
points in $S$ are distinct. The latter is the well-known element uniqueness problem,
which has an $\Omega(n \log n)$ lower bound [BO83].

6 A Bicriteria Approximation Scheme 

In practice, factor-4 or factor-2 approximations to the MLUSP and MLUCP
problems could be far from satisfactory. In this section, we consider a bicriteria
variant of the basic problem. We show that if we are allowed to leave a
small subset of sites unlabeled, we can obtain an approximate solution which is almost
optimal. We focus only on the MLUSP problem, as the generalization to MLUCP is
straightforward.

First of all, we need a few definitions. Define a polynomial time $(\alpha, \beta)$
approximation algorithm for the MLUSP problem as a polynomial time approximation
algorithm that finds a placement of square pairs for at least an $\alpha$ fraction of the sites
such that the size of each square is at least $\beta$ times the size of a square in an
optimal solution that places square pairs at every site.

An undirected graph is a unit square graph if and only if its vertices can be
put in one-to-one correspondence with equal-sized squares in the plane in such
a way that two vertices are joined by an edge if and only if the corresponding
squares intersect. (When dealing with this graph we assume that tangent squares
intersect.) For any fixed $\lambda > 0$, we say that a square graph is a $\lambda$-precision square
graph if the centers of any two squares are at least distance $\lambda$ apart.
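
To make these definitions concrete, here is a small Python sketch that builds the unit square graph from a list of square centers and tests the $\lambda$-precision condition. The function names are hypothetical, the squares are assumed to be axis-parallel, the distance in the $\lambda$-precision test is assumed to be Euclidean (the text does not say which metric is meant), and the greedy routine is only a simple heuristic for obtaining some independent set, not the approximation scheme of [HM+94].

```python
from itertools import combinations

def unit_square_graph(centers, side=1.0):
    """Edges between axis-parallel squares of equal `side` that intersect;
    tangent squares count as intersecting, so the test is L_inf distance <= side."""
    edges = set()
    for (i, p), (j, q) in combinations(enumerate(centers), 2):
        if max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= side:
            edges.add((i, j))
    return edges

def is_lambda_precision(centers, lam):
    """True if every two centers are at least `lam` apart (Euclidean, by assumption)."""
    return all(
        ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 >= lam
        for p, q in combinations(centers, 2)
    )

def greedy_independent_set(n, edges):
    """Simple greedy heuristic: scan vertices in index order and keep each one
    that has no previously chosen neighbor."""
    chosen = []
    for v in range(n):
        if all((min(u, v), max(u, v)) not in edges for u in chosen):
            chosen.append(v)
    return chosen

centers = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
edges = unit_square_graph(centers)
independent = greedy_independent_set(len(centers), edges)
```

In this example the first two squares are tangent and therefore adjacent, so an independent set can contain only one of them.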

Finally we recall some graph theoretic definitions for the sake of completeness. 

Maximum Independent Set (MIS): Given a graph $G = (V, E)$, a maximum
independent set for $G$ is a maximum cardinality subset $V'$ of $V$ such that for
each $u, v \in V'$, $(u, v) \notin E$.

It is well known that the maximum independent set problem is NP-complete,
even when restricted to square graphs [GJ79]. But as shown in [HM+94], if we
are given a geometric specification of the squares, the corresponding maximum
independent set problem has a polynomial time approximation scheme (PTAS).



6.1 Overview 

The basic idea of our $(\alpha, \beta)$ approximation algorithm is to reduce the MLUSP
problem to the MLUS problem. Then, following the approach in [DMMMZ97],
we further reduce the MLUS problem to the problem of finding maximum
independent sets in a collection of $\lambda$-precision unit square graphs.

A simple idea for the first reduction is to duplicate each site in the input
to MLUSP and then solve the MLUS problem. Unfortunately, the second
reduction would then fail: as discussed in the above definition, if two squares
touch each other, there is an edge in the corresponding unit square graph.
However, we need to label each site with a pair of squares, which certainly
touch each other. Transforming these two squares into vertices of the unit
square graph would then force an independent set to exclude one of them,
which we want to avoid.

We deal with this problem by splitting a site $(x, y)$ into two 'dummy' sites:
$(x - \delta, y - \delta)$ and $(x + \delta, y + \delta)$. We call the new problem of labeling these $2n$
sites with $2n$ uniform squares MLUS-2. We denote the new set of sites by $S_{2n}$.
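
The splitting step can be sketched in a few lines of Python. The function name `split_sites` and the convention that dummy sites $2i$ and $2i+1$ belong to original site $i$ are our own assumptions for illustration.

```python
def split_sites(sites, delta):
    """Replace each site (x, y) by the two dummy sites (x - delta, y - delta)
    and (x + delta, y + delta); the result has 2n sites, with the dummies of
    original site i stored at positions 2i and 2i + 1."""
    dummies = []
    for x, y in sites:
        dummies.append((x - delta, y - delta))
        dummies.append((x + delta, y + delta))
    return dummies

# Hypothetical example: one site split with delta = 0.25.
example = split_sites([(1.0, 2.0)], 0.25)
```

Keeping the two dummies of a site adjacent in the output makes it easy to translate a pair of labels back to the original site afterwards.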

Theorem 9. Let $OPT_2$ be the optimal solution of MLUS-2 and let $OPT = L^*$
be the optimal solution of MLUSP. Then $OPT_2 \ge OPT - 2\delta \ge (1 - c \cdot \delta)\,OPT$,
where $c$ is some constant.

Proof. If we shrink $OPT$ by an additive term of $2\delta$, then after we split (and
translate) a site into two dummy sites, any shrunk (and translated) square does
not intersect any other shrunk square. So $OPT - 2\delta$ gives a feasible solution
to MLUS-2, and by the optimality of $OPT_2$ we have $OPT_2 \ge OPT - 2\delta$. By
choosing a suitable constant $c$, $OPT_2 \ge OPT - 2\delta \ge (1 - c \cdot \delta)\,OPT$. □

Now, having constructed an input for MLUS-2 from an input for MLUSP,
all we need is a bicriteria approximation scheme for MLUS-2. As
mentioned before, the basic idea is to reduce the MLUS-2 problem to finding
maximum independent sets in a collection of unit square graphs, as is done
in [DMMMZ97]. Below, we outline the major idea in labeling at least a $(1 - \epsilon)$
fraction of the sites in $S_{2n}$ with squares of size at least $(1 - \epsilon)/(1 + \epsilon)$ of the optimal
solution for MLUS-2, where $\epsilon$ is some small constant.

It was shown in [DMMMZ97] that the optimal solution for MLUS-2, $OPT_2$,
is bounded above by $D_5(S_{2n})$, which is the min-5-diameter of $S_{2n}$ under $L_2$.
Finding $D_5(S_{2n})$ takes only $O(n \log n)$ time by [EE94]. Therefore we can search
for a good estimate of $OPT_2$ in the range $[0, D_5(S_{2n})]$ using powers of $(1 + \epsilon)$,
where $\delta > \epsilon$. (Notice that we can scale this interval such that $OPT_2 \ge 1$;
this will only affect the running time of the algorithm by a constant factor.
Therefore, without loss of generality, we assume that $OPT_2 \ge 1$.) At each stage
we discretize the edge of a square into roughly $1/\epsilon$ points at which a (dummy)
site can be located. Correspondingly, at each stage we change this problem into
a maximum independent set problem in a $\lambda$-precision unit square graph (with $\lambda$ depending on $\epsilon$) in
which each vertex corresponds to a discretized unit square, i.e., a possible label
for a site. Although this problem is NP-hard, it has been shown in [HM+94]
that the maximum independent set problem for unit square graphs specified
geometrically has a polynomial time approximation scheme. We use this as a
subroutine to find a near-optimal collection of non-overlapping squares to be
placed at those feature points.
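
The search over powers of $(1 + \epsilon)$ can be sketched as follows. This is only an illustration under the stated scaling assumption (the optimum is at least 1); the names `candidate_sizes` and `bracket` are hypothetical. The helper `bracket` is included only to show which candidate lies just below a given value; the algorithm itself does not know the optimum and would try every candidate.

```python
from bisect import bisect_right

def candidate_sizes(upper, eps):
    """Powers of (1 + eps) in [1, upper], assuming the instance has been
    scaled so that the optimum is at least 1."""
    sizes, s = [], 1.0
    while s <= upper:
        sizes.append(s)
        s *= 1.0 + eps
    return sizes

def bracket(sizes, value):
    """Index k with sizes[k] <= value < sizes[k + 1] (or the last index)."""
    return bisect_right(sizes, value) - 1

sizes = candidate_sizes(10.0, 1.0)   # eps = 1 only to keep the numbers exact
k = bracket(sizes, 5.0)
```

The number of candidates grows logarithmically in the upper bound for fixed $\epsilon$, which matches the count of square graphs constructed per instance.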

So we have successfully reduced the MLUS-2 problem to that of finding
maximum independent sets of a number of square graphs. Specifically, given an
instance $S_{2n}$, we construct $O(\log D_5(S_{2n}))$ square graphs, each of size
polynomial in $2n$ and $1/\epsilon$. For each of these square graphs we obtain an approximate
solution to the Maximum Independent Set problem, for which a polynomial
time approximation scheme is known [HM+94].

As $OPT_2 \le D_5(S_{2n})$, there exists some iteration $k'$ such that $(1 + \epsilon)^{k'} \le
OPT_2 < (1 + \epsilon)^{k'+1}$. By [HM+94], in $O(\log D_5(S_{2n}))$ iterations we can find a
placement of $(1 - \epsilon) \cdot 2n$ squares whose size is at least $(1 + \epsilon)^{k'} - \epsilon (1 + \epsilon)^{k'}$. Clearly

$(1 + \epsilon)^{k'} - \epsilon (1 + \epsilon)^{k'} \ge (1 + \epsilon)^{k'} (1 - \epsilon) \ge \frac{1 - \epsilon}{1 + \epsilon} \cdot OPT_2$.

Consequently we have the following theorem. 

Theorem 10. For any fixed $\delta > \epsilon > 0$, given an instance of $n$ points of MLUSP,
we can find a placement of at least $(1 - 2\epsilon) \cdot n$ square pairs with size at least
$(1 - 2\epsilon - c\delta + o(\epsilon\delta))\,OPT$, where $c$ is some constant and $OPT$ denotes the optimal
solution.

Proof. When approximating $OPT_2$, we fail to label at most $\epsilon \cdot 2n$ (dummy) sites.
In the worst case this will destroy at most the same number of square pairs when
we convert MLUS-2 back to MLUSP, which is simply done by translating the
two squares associated with a pair of dummy sites back to the original site.

As we approximate $OPT_2$ with a factor of at least $\frac{1 - \epsilon}{1 + \epsilon}$, following Theorem
9, overall we approximate $OPT$ with a factor of $\frac{1 - \epsilon}{1 + \epsilon} \cdot (1 - c \cdot \delta)$, which is
$(1 - 2\epsilon - c\delta + o(\epsilon\delta))$. In other words, after we transform the approximate solution
of MLUS-2 back to MLUSP we have an approximation scheme for MLUSP with
performance guarantee of at least $(1 - 2\epsilon - c\delta + o(\epsilon\delta))\,OPT$. □
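
The last step of the proof can be spelled out as a short calculation. The following is our own expansion, assuming $\epsilon > 0$ and $0 \le c\delta \le 1$; the cross term on the right appears to be what the proof abbreviates as the lower-order term.

```latex
\[
\frac{1-\epsilon}{1+\epsilon} \;=\; 1 - \frac{2\epsilon}{1+\epsilon} \;\ge\; 1 - 2\epsilon ,
\]
\[
\frac{1-\epsilon}{1+\epsilon}\,(1 - c\delta)
\;\ge\; (1 - 2\epsilon)(1 - c\delta)
\;=\; 1 - 2\epsilon - c\delta + 2c\,\epsilon\delta .
\]
```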

The running time of this algorithm is $O(n \log D_5(S_{2n}))$, as the algorithm for
approximating the maximum independent set in a $\lambda$-precision unit square graph
takes $O(n)$ time [HM+94]. The construction of the input for MLUS-2 from the input for
MLUSP and the construction of the output for MLUSP from the output for MLUS-2
both take linear time.

7 Concluding Remarks 

One of the major questions regarding this paper is the computational complexity 
of the two problems we have just studied, namely the MLUSP and MLUCP. Are 
they NP-complete? The second question is whether the approximation constant 
factors we have achieved (i.e., 4 and 2) are optimal or not.

Abstract. We study the scheduling of a set of jobs, each characterised
by a release (arrival) time and a processing time, for a batch processing
machine capable of running (at most) a fixed number of jobs at a time.
When the job release times and processing times are known a priori and
the inputs are integers, we obtain an algorithm for finding a schedule
with the minimum makespan. The running time is pseudo-polynomial
when the number of distinct job release times is constant. We also
obtain a fully polynomial time approximation scheme when the number
of distinct job release times is constant, and a polynomial time
approximation scheme when that number is arbitrary. When nothing is known
about a job until it arrives, i.e., the on-line setting, we prove a lower
bound of $(\sqrt{5} + 1)/2$ on the competitive ratio of any approximation
algorithm. This bound is tight when the machine capacity is unbounded.



1 Introduction 

Job scheduling on a batch processing system is an important issue in the
manufacturing industry. As a motivating example, the manufacturing of very large
scale integrated circuits requires a burn-in operation in which the integrated
circuits are put into an oven in batches and heated for a prolonged period (in terms
of days) to bring out any latent defects. (See Lee [H] for the detailed process.)
Since different circuits require different minimum baking times, the batching and
scheduling of the integrated circuits is highly non-trivial and can greatly affect
the production rate.

There are many variants of the batch processing problem. The basic setting of
the problem we consider is as follows. We are given a set $J = \{1, \ldots, n\}$ of jobs
to be processed. Each job $i$ is associated with a release time $r_i$, which specifies
when the job arrives, and a processing time $p_i$, which specifies the minimum time
needed to process the job by a batch processor. A batch processor is capable of
processing up to $B$ jobs simultaneously as a batch. The time, $\pi_I$, required to
process a batch, $I$, is taken as the maximum processing time of an individual job
in the batch, i.e., $\pi_I = \max_{i \in I} p_i$. Our goal is to find a schedule with the smallest
makespan, $C_{\max}$, which is the time to finish all the jobs in $J$. In this paper, we
consider the variant in which there is only one batch processor. We
adopt the notation of Graham et al. [8] and denote the problem by $1|r_j, B|C_{\max}$.
In comparison, the special case when all jobs arrive at the same time is denoted
by $1|B|C_{\max}$.

1.1 Previous Related Work 

There has been a lot of research work ([9], m, i, [II, m, m. m, s, m) 
on similar or related scheduling problems. Lee and Uzsoy m advocated for 
the study of the l\rj,B\Cmax problem. They gave an 0(n^)-time algorithm for 
the special case when B = oo, extended it to polynomial time algorithms for 
other special cases, and proposed a pseudo-polynomial time algorithm when 
there are only two distinct release times. For the general l\rj, B\Cmax problem, 
however, they only proposed a number of heuristics. The general problem was 
later shown to be strongly NP-hard by Brucker et al [^ . Along with other results, 
they presented an 0(n)-time algorithm for l\rj^B = \-\Cmax-i and improved 
Bartholdi’s FBLPT algorithm [2] for the V\B\Cmax problem from 0(n log n) to 
min{0(nlog n),0{n^ / B)}. 

1.2 Our Contributions 

In this paper, we present several algorithms for the $1|r_j, B|C_{\max}$ problem. First
we give an algorithm which computes, in time $O(mn(P_{\max} P_{sum} B)^{m-1})$, a
schedule with the minimum makespan, when there are $n$ jobs, $m$ distinct job release
times, and the maximum and sum of the processing times are $P_{\max}$ and $P_{sum}$
respectively. For any constant $m$, the running time is pseudo-polynomial. This
generalizes and improves an algorithm of Lee and Uzsoy [10], which takes
pseudo-polynomial time when there are only two distinct release times. Note that both
our algorithm and the Lee-and-Uzsoy algorithm only work for integer inputs.
When there is no integral value restriction on the input, we obtain a fully
polynomial time approximation scheme which computes a schedule with makespan at
most (1 -I- e) times the minimum makespan in time B'^~^ / 

for any e > 0. The time is thus polynomial in n and 1/e for constant m. Finally, 
we extend this result to a polynomial approximation scheme which computes a 
(1 -I- e)-approximation in time for any e > 0. Hence 

the time is polynomial in $n$ (although exponential in $1/\epsilon$).

In addition, the problem may be considered in the on-line setting, where nothing
is known about a job until it is released. We prove a lower bound
of $(\sqrt{5} + 1)/2$ on the competitive ratio of any approximation algorithm. We
show that the lower bound is tight for the case when $B = \infty$ by exhibiting an
approximation algorithm that achieves the same ratio.

The organisation of the rest of the paper is as follows. In section 2, we pro- 
vide the basic concepts and the FBLPT algorithm. We then present the pseudo- 
polynomial time exact algorithm in section 3 and the two polynomial time ap- 
proximation schemes in section 4. We turn to the on-line version of the problem 
in section 5. We present a lower bound for the problem and a matching upper 
bound for the special case when the batch machine has infinite capacity, i.e., 
B = oo. Finally, in section 6, we discuss the future direction of research in this 
area. Due to space limitation, most of the proofs are omitted. 

2 Terminology and the FBLPT Algorithm 

An algorithm $A$ is a $(1 + \epsilon)$-approximation algorithm for a minimization problem
if it produces a solution which is at most $(1 + \epsilon)$ times the optimal one. A family
of algorithms $\{A_\epsilon\}$ is called a polynomial time approximation scheme if, for every
$\epsilon > 0$, the algorithm $A_\epsilon$ is a $(1 + \epsilon)$-approximation algorithm running in time
polynomial in the input size when $\epsilon$ is treated as a constant. It is called a fully
polynomial time approximation scheme if the running time is also polynomial in
$1/\epsilon$. Throughout this paper, we denote by $\lfloor x \rfloor$ the largest integer smaller
than or equal to $x$.

Given a set of jobs $J$, we define $P_{\max}$ and $r_{\max}$ as the maximum processing
time and the maximum release time of a job respectively: $P_{\max} = \max_{j \in J} p_j$
and $r_{\max} = \max_{j \in J} r_j$. We define $P_{sum}$ as the total processing time: $P_{sum} =
\sum_{j \in J} p_j$. A schedule for $J$ is a sequence $(J_1, t_1), (J_2, t_2), \ldots, (J_k, t_k)$ such that
$\{J_1, J_2, \ldots, J_k\}$ is a partition of $J$ and, for all $i = 1, 2, \ldots, k$: $|J_i| \le B$;
$r_j \le t_i$ for all $j \in J_i$; and $t_i + \pi_i \le t_{i+1}$ (for $i < k$), where $\pi_i = \max\{p_j : j \in J_i\}$.
The makespan of the schedule is $t_k + \pi_k$. Obviously, the minimum makespan, $C^*$, over all schedules
must be at least $r_{\max}$ and $P_{sum}/B$: $C^* \ge \max\{P_{sum}/B, r_{\max}\}$.
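
The definition of a schedule and the lower bound on $C^*$ can be checked mechanically. The following Python sketch is an illustration only; the function names, the 0-based job indices, and the sample instance are our own assumptions.

```python
def makespan(schedule, release, proc, capacity):
    """Validate a schedule [(batch, start), ...] and return its makespan.
    `batch` is a list of job indices; `release[j]` and `proc[j]` are r_j and p_j."""
    jobs = [j for batch, _ in schedule for j in batch]
    if sorted(jobs) != list(range(len(proc))):
        raise ValueError("batches must partition the job set")
    finish = 0.0
    for batch, start in schedule:
        if not batch or len(batch) > capacity:
            raise ValueError("batch size must be between 1 and B")
        if any(release[j] > start for j in batch):
            raise ValueError("a job starts before its release time")
        if start < finish:
            raise ValueError("batch starts before the previous one ends")
        finish = start + max(proc[j] for j in batch)   # t_i + pi_i
    return finish

def lower_bound(release, proc, capacity):
    """max{P_sum / B, r_max}, the bound stated in the text."""
    return max(sum(proc) / capacity, max(release))

# Hypothetical instance: three jobs, batch capacity B = 2.
release, proc, B = [0, 0, 2], [3, 2, 1], 2
c = makespan([([0, 1], 0), ([2], 3)], release, proc, B)
```

In this instance the first batch occupies the machine until time 3, so the late-arriving job cannot start at its release time.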

Before explaining our algorithms, we first describe the FBLPT (Full Batch
Largest Processing Time) algorithm of Bartholdi [2], which computes the
optimal solution for the $1|B|C_{\max}$ problem.

Algorithm FBLPT

Step 1. Sort the jobs according to their processing times and re-order the indices
so that $p_1 \ge p_2 \ge \cdots \ge p_n$.

Step 2. Form batches by placing jobs $iB + 1$ through $(i + 1)B$ together in the
same batch for $i = 0, 1, \ldots, \lfloor n/B \rfloor$.

Step 3. Schedule the batches in any arbitrary order.

The schedule contains at most $\lfloor n/B \rfloor + 1$ batches and all batches are full
except possibly the last one. From now on, a schedule is said to follow the
FBLPT scheduling if it groups the jobs according to the FBLPT algorithm and
schedules them in non-increasing order of processing time.
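
Steps 1 and 2 of FBLPT translate directly into a few lines of Python. This is a minimal sketch using 0-based job indices; the names `fblpt` and `total_time` are hypothetical, and the makespan helper assumes all jobs are available at time 0, as in the $1|B|C_{\max}$ setting.

```python
def fblpt(proc, capacity):
    """Group jobs into batches as in Steps 1 and 2: sort by non-increasing
    processing time, then cut the sorted list into consecutive groups of B."""
    order = sorted(range(len(proc)), key=lambda j: -proc[j])
    return [order[i:i + capacity] for i in range(0, len(order), capacity)]

def total_time(batches, proc):
    """Makespan when all jobs are available at time 0 and batches run back to back."""
    return sum(max(proc[j] for j in batch) for batch in batches)

proc = [3, 1, 4, 1, 5]
batches = fblpt(proc, 2)
```

Slicing the sorted list never produces an empty batch, so this sketch returns $\lceil n/B \rceil$ batches, which is consistent with the bound of at most $\lfloor n/B \rfloor + 1$ given in the text.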



3 A Pseudo-polynomial Time Algorithm for a Special 
Case 

In this section, we restrict our discussion to cases where the release times and
processing times are integers. Let the set of jobs $J$ have $m$ distinct job release
times, $0 = r'_1 < r'_2 < \cdots < r'_m$. Thus, there are jobs with the same release time
when $n > m$. For technical reasons, we also define $r'_{m+1} = \infty$. We index the jobs
in non-increasing order of processing time, i.e., $p_1 \ge p_2 \ge \cdots \ge p_n$.

Given a schedule, $S$, for the set of jobs $J$, we can partition $J$ into $m$
disjoint subsets, $J_1(S), \ldots, J_m(S)$, such that a job is in $J_i(S)$ if and only if it is
scheduled at or after time $r'_i$ but strictly before $r'_{i+1}$ according to $S$. A key
observation is that we can locally rearrange the schedule of each subset $J_i(S)$, without
increasing its makespan, so that it follows the FBLPT scheduling.

Lemma 1. For any schedule $S$ for $J$ with makespan $C_{\max}$, there exists a
schedule $S'$ for $J$ with makespan $C'_{\max}$ such that $C'_{\max} \le C_{\max}$, and such that $J_i(S) =
J_i(S')$ and the schedule for $J_i(S')$ according to $S'$ follows the FBLPT scheduling
for every $i$.

We first describe the algorithm for computing the minimum makespan. The
adaptation to finding the schedule itself is straightforward. For every $1 \le i \le m$,
if $J_i(S)$ is non-empty, then there is a start time $t \ge r'_i$ at which the first batch
of $J_i(S)$ is started. Strict inequality happens when the last batch of $J_{i-1}(S)$ finishes
after time $r'_i$ (even though it started before time $r'_i$). The maximum delay of
the first batch of $J_i(S)$ is at most $P_{\max}$. We define the delay time of $J_i(S)$,
denoted by $b_i(S)$, as the start time of the first batch of $J_i(S)$ minus $r'_i$. Thus
$0 \le b_i(S) \le P_{\max}$. We also define the completion time of $J_i(S)$, denoted by $c_i(S)$,
as the time needed to complete all the jobs in $J_i(S)$. Thus $0 \le c_i(S) \le P_{sum}$. We
define the size of $J_i(S)$, denoted by $n_i(S)$, as the number of jobs modulo $B$ in the
last batch in $J_i(S)$. ($n_i(S)$ ranges from 0 to $B - 1$.) Intuitively, $(B - n_i(S)) \bmod B$ is
the number of jobs, with sufficiently small processing time, that can be added to
the last batch without increasing the completion time of $J_i(S)$. If $J_i(S)$ is empty,
we simply define $b_i(S)$, $c_i(S)$ and $n_i(S)$ as zero. Finally, we define the makespan
of a schedule $S$ for jobs $\{1, \ldots, j\}$ as $r'_m + b_m(S) + c_m(S)$. Hence, the makespan
is always at least $r'_m$. For every possible $\mathbf{B} = (b_2, \ldots, b_m)$, $\mathbf{C} = (c_1, \ldots, c_{m-1})$
and $\mathbf{N} = (n_1, \ldots, n_{m-1})$, we define $f(j, \mathbf{B}, \mathbf{C}, \mathbf{N})$ as the minimum makespan of
a schedule $S$ for jobs $\{1, \ldots, j\}$ such that

1. for $2 \le i \le m$, $J_i(S)$ has delay time $b_i$, and

2. for $1 \le i \le m - 1$, $J_i(S)$ has completion time $c_i$ and size $n_i$.

(We can always assume $b_1(S) = 0$, and $n_m(S)$ can be computed as $(j - n_1(S) -
\cdots - n_{m-1}(S)) \bmod B$.) Our algorithm uses a dynamic programming approach to
compute $f(j, \mathbf{B}, \mathbf{C}, \mathbf{N})$ for each possible $(j, \mathbf{B}, \mathbf{C}, \mathbf{N})$. First, let us consider the
value of $f(1, \mathbf{B}, \mathbf{C}, \mathbf{N})$. Suppose job 1 has release time $r'_i$ for some $i$. Then it can
only be scheduled at or after time $r'_i$. Furthermore, if it is scheduled in the time
interval $[r'_k, r'_{k+1})$ for some $i \le k \le m$, then $J_k(S)$ contains only job 1 and all the
other $J_i(S)$'s are empty. Also, $J_k(S)$ can be completed at time $r'_k + b_k(S) + p_1$.
Hence for $i \le k \le m$ and for all $b_k$, if we let



A;— 1 m—k 

Bk = 

k m—k—1 

= (Oj • • • ) 0) + Pi) 0; • ■ • ) 0) 

k m — k—1 

Nk = (b,o,.^,i,'6r^), 

and take Bi — B, Cm = C and Nm = N , then we have 

/(l,Bfe,Cfc, Affc) = max{r'fc + bk+Pi,r'm}- 

For the other possible values of (JB, C, N), we set /(I, B, C, N) = oo. 

Now suppose the value of $f(j', \mathbf{B}, \mathbf{C}, \mathbf{N})$ has been computed for every $j' < j$ and every possible $(\mathbf{B}, \mathbf{C}, \mathbf{N})$. Consider $f(j, \mathbf{B}, \mathbf{C}, \mathbf{N})$. Again, let job $j$ have release time $r_j = r'_i$ for some $i$. Then it can only be scheduled at or after time $r'_i$. Suppose it is inserted into $J_k(S)$ for some $k$ where $i \le k \le m$ by some schedule $S$. By Lemma 1 we can assume that $J_k(S)$ follows the FBLPT scheduling. Since job $j$ has the smallest processing time among all jobs in $J_k(S)$, it is inserted into the last batch of $J_k(S)$.

Imagine that the schedule for jobs $\{1, \ldots, j-1\}$ has been determined by $S$ and now job $j$ is added to the existing schedule. There are $n_k$ jobs in the last batch of $J_k(S)$ after adding job $j$. If $n_k > 1$, then the last batch of $J_k(S)$ is non-empty before inserting job $j$. Hence adding job $j$ does not increase the completion time of $J_k(S)$. (Remember that the processing time, $p_j$, of job $j$ is no more than the processing time of the last batch of $J_k(S)$ by the order in which we consider the jobs.) If $n_k = 1$, then the last batch of $J_k$ is either full or does not exist at all before inserting job $j$. Hence inserting job $j$ increases the completion time of $J_k$ by $p_j$. If $k = m$, then the makespan will also be increased.
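We are now ready to describe the recursive relation for the dynamic programming. Recall that $\mathbf{C} = (c_1, \ldots, c_{m-1})$ and $\mathbf{N} = (n_1, \ldots, n_{m-1})$. Then for $i \le k < m$, define $\mathbf{C}_k = (c_1, \ldots, c_{k-1}, c_k - p_j, c_{k+1}, \ldots, c_{m-1})$ and $\mathbf{N}_k = (n_1, \ldots, n_{k-1}, (n_k - 1) \bmod B, n_{k+1}, \ldots, n_{m-1})$. For $i \le k < m$, define

$$f_k = \begin{cases} f(j-1, \mathbf{B}, \mathbf{C}, \mathbf{N}_k) & \text{if } n_k > 1, \\ f(j-1, \mathbf{B}, \mathbf{C}_k, \mathbf{N}_k) & \text{if } n_k = 1 \text{ and } c_k \ge p_j \text{ and } k < m, \\ \infty & \text{if } n_k = 1 \text{ and } c_k < p_j, \end{cases}$$

and

$$f_m = \begin{cases} f(j-1, \mathbf{B}, \mathbf{C}, \mathbf{N}) & \text{if } n_m > 1, \\ f(j-1, \mathbf{B}, \mathbf{C}, \mathbf{N}) + p_j & \text{if } n_m = 1. \end{cases}$$

Then we have

$$f(j, \mathbf{B}, \mathbf{C}, \mathbf{N}) = \min\{ f_k \mid i \le k \le m \}.$$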

The minimum makespan of the original problem can be computed by taking the minimum $f(n, \mathbf{B}, \mathbf{C}, \mathbf{N})$ over all possible $(\mathbf{B}, \mathbf{C}, \mathbf{N})$. As described before, $0 \le b_i \le p_{\max}$, $0 \le c_i \le p_{\mathrm{sum}}$ and $0 \le n_i \le B - 1$. Therefore, there are $O(n (p_{\max} p_{\mathrm{sum}} B)^{m-1})$ sets of possible input values. For each set, it takes $O(m)$ time to compute the value of $f(j, \mathbf{B}, \mathbf{C}, \mathbf{N})$. Hence the algorithm takes time $O(mn (p_{\max} p_{\mathrm{sum}} B)^{m-1})$. If the number of distinct job release times, $m$, is constant, then the algorithm runs in pseudo-polynomial time. Hence the $1|r_j, B|C_{\max}$ problem is not strongly NP-hard in this special case.

4 Approximating Optimal Makespan in Polynomial Time 

The previous algorithm solves the $1|r_j, B|C_{\max}$ problem optimally when the input values are integers. We now extend this algorithm to a fully polynomial time approximation scheme for the same $1|r_j, B|C_{\max}$ problem but without the integral value restriction. The algorithm AMM (Approximate-Minimum-Makespan) accepts as input a set of jobs $J$ with release times and processing times, and an accuracy parameter $\epsilon$; and outputs a schedule with makespan at most $(1 + \epsilon)$ times the minimum makespan.

Algorithm AMM($J$, $\epsilon$)

Step 1. Let Mo = max, Pmax} and M = 

Step 2. Round down all numbers which appear in the input to the nearest multiples of $M$ to obtain another input instance $\tilde{J} = \{1, \ldots, n\}$ such that job $j$ has release time $\tilde{r}_j = \lfloor r_j / M \rfloor$ and processing time $\tilde{p}_j = \lfloor p_j / M \rfloor$.

Step 3. Compute an optimal schedule $S$ for the rounded down input $(\tilde{J}, B)$ using the algorithm in the previous section.

Step 4. Return $S$ as the schedule for the original input.

Note that when applying schedule $S$ on both $J$ and $\tilde{J}$, the start time of each job in $J$ may be later than that of the corresponding job in $\tilde{J}$. This is because each number in the instance $\tilde{J}$ can be smaller than the corresponding number in the instance $J$.
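The rounding in Step 2 can be sketched as below. This is only an illustration: the rounding unit `M` is taken as a given positive parameter (its defining formula is not reproduced here), and the function name and sample jobs are hypothetical.

```python
import math


def round_down_instance(jobs, M):
    """jobs: list of (release, processing) pairs; M: positive rounding unit (assumed given)."""
    return [(math.floor(r / M), math.floor(p / M)) for r, p in jobs]


rounded = round_down_instance([(0.0, 2.5), (1.2, 0.9), (3.7, 4.0)], M=0.5)
```

The rounded values are integers, so the pseudo-polynomial dynamic program of the previous section applies to them directly.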

Theorem 1. The AMM algorithm is a fully polynomial approximation scheme for the $1|r_j, B|C_{\max}$ problem when there are a constant number of distinct job release times.

Next we extend the above algorithm to a polynomial time approximation scheme for the $1|r_j, B|C_{\max}$ problem in which the number of distinct release times, $m$, is non-constant. The idea is to approximate the (at most) $n$ distinct release times using a constant number of distinct release times.

Algorithm AMM2($J$, $\epsilon$)

Step 1. Let K = (e/2) 

^max- 

Step 2. Round down each $r_j$ to the nearest multiple of $K$, that is, set $\tilde{r}_j = K \lfloor r_j / K \rfloor$. Note that there will be at most $2/\epsilon$ distinct release times in the rounded down problem.

Step 3. Use the $(1 + \epsilon/2)$-approximation scheme for the rounded problem to obtain a schedule.

Step 4. Schedule the jobs for the original problem by the ordering obtained in Step 3.
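The effect of Step 2 on the number of distinct release times can be seen in a small sketch. Here `K` is passed in as a parameter rather than derived from $\epsilon$, and the function name and sample release times are hypothetical.

```python
import math


def round_release_times(release_times, K):
    """Round each release time down to the nearest multiple of K."""
    return [K * math.floor(r / K) for r in release_times]


releases = [0.0, 0.3, 1.1, 1.4, 2.6, 3.9]
rounded = round_release_times(releases, K=1.0)
distinct = sorted(set(rounded))
```

Six release times collapse to four distinct values, and no job is released later than in the original instance.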



Theorem 2. The AMM2 algorithm is a polynomial time approximation scheme for the general $1|r_j, B|C_{\max}$ problem.

5 The On-Line Scheduling Problem 

In this section, we study the on-line version of the problem. In particular, we consider the situation in which we have absolutely no information about the jobs. We do not know how many jobs there are and, for each job $i$, we do not know its processing time $p_i$ until it arrives, i.e., at time $r_i$, which is also unknown until the job arrives. We also do not allow pre-empting a scheduled batch. (Note that allowing pre-emption could help a lot.) Due to the lack of information concerning the future, no on-line algorithm performs very well in the worst case. We found that the worst case competitive ratio of any on-line algorithm is at least $(\sqrt{5} + 1)/2$, the golden ratio. For convenience, we will let $\delta = (\sqrt{5} - 1)/2$ throughout this section.
Theorem 3. Any on-line algorithm has a worst-case competitive ratio of at least $1 + \delta = (\sqrt{5} + 1)/2 \approx 1.618$.

Proof: (Sketch) We will construct a worst case input with one or two jobs according to the actions taken by the algorithm. Job 1 with processing time $p_1 = 1$ will arrive at time zero. Then the on-line algorithm will schedule it at some time $s_1$ (which could be zero). If $s_1 \ge \delta$, then the adversary will not give any more jobs. If $s_1 < \delta$, then the adversary will give a second job with processing time $p_2 = 1$, which arrives at time $s_1 + \epsilon$ for some small value $\epsilon > 0$. In both cases, the competitive ratio is at least $1 + \delta$. □
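The adversary argument can be checked numerically with a short sketch. It assumes a batch capacity of at least two (so the optimum can run both unit jobs together) and a small fixed $\epsilon$; the function name `adversary_ratio` and the grid of start times are hypothetical.

```python
import math

DELTA = (math.sqrt(5) - 1) / 2  # delta from the text; 1 + DELTA is the golden ratio


def adversary_ratio(s1, eps=1e-9):
    """Ratio forced when the algorithm starts job 1 (r1 = 0, p1 = 1) at time s1."""
    if s1 >= DELTA:
        # No further job arrives: the algorithm finishes at s1 + 1, the optimum at 1.
        return (s1 + 1) / 1.0
    # A second unit job arrives at s1 + eps, just too late to join the first batch.
    online = s1 + 1 + 1
    optimum = (s1 + eps) + 1  # start both jobs together when job 2 arrives
    return online / optimum


# Best the on-line algorithm can do over a grid of possible start times.
best_for_algorithm = min(adversary_ratio(i / 1000) for i in range(0, 2001))
```

Whatever start time is chosen on the grid, the ratio stays at or above $1 + \delta$ up to the effect of the small $\epsilon$.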

Theorem 4. When the capacity $B$ of the machine is infinite, there exists an on-line algorithm with competitive ratio $1 + \delta$.

Proof: The idea of the algorithm is as follows. At any current moment $t$, the algorithm maintains a commence time $S(t) = \max_{j \in U(t)} \{(1 + \delta) r_j + \delta p_j\}$, where $U(t)$ is the set of jobs that are available and remain unscheduled at time $t$. If no new job arrives before $S(t)$, then it will schedule all jobs of $U(t)$ to run at time $S(t)$. Otherwise, if a job $i$ arrives at time $t'$ such that $t < t' < S(t)$, then it computes the possibly new commence time $S(t') = \max_{j \in U(t')} \{(1 + \delta) r_j + \delta p_j\}$, where $U(t') = U(t) \cup \{i\}$. Then it waits until time $S(t')$ to schedule all jobs in $U(t')$, or until another new job arrives, upon which it will change its plan again.
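The commence-time rule can be simulated off-line as a sanity check. The sketch below is an illustration under stated assumptions, not the authors' implementation: the machine has unbounded capacity, a batch runs for the longest processing time it contains, and a job arriving strictly before the planned start triggers a recomputation. The name `online_makespan` is hypothetical.

```python
import math

DELTA = (math.sqrt(5) - 1) / 2


def online_makespan(jobs):
    """jobs: list of (release, processing) pairs; returns the simulated makespan."""
    jobs = sorted(jobs)
    t = 0.0        # time at which the machine becomes free
    pending = []   # released but unscheduled jobs, the set U(t)
    idx = 0        # index of the next job not yet released
    while idx < len(jobs) or pending:
        if not pending:
            pending.append(jobs[idx])
            idx += 1
        # Commence time S(t), never earlier than the machine becomes free.
        start = max(t, max((1 + DELTA) * r + DELTA * p for r, p in pending))
        if idx < len(jobs) and jobs[idx][0] < start:
            pending.append(jobs[idx])  # a new arrival changes the plan
            idx += 1
            continue
        t = start + max(p for _, p in pending)  # the batch runs for its longest job
        pending = []
    return t


single = online_makespan([(0, 1)])
double = online_makespan([(0, 1), (0.5, 1)])
```

For the two-job input the optimum starts both jobs at time 0.5 and finishes at 1.5, so the simulated schedule meets the bound $1 + \delta$ exactly.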

To see that the algorithm achieves the stated competitive ratio, consider the last batch in the schedule obtained by the algorithm and trace backward in time to the first idle time (or to the first batch of the schedule if there is no idle time). Let the sequence of batches be $(B_1, B_2, \ldots, B_k)$ where $B_1$ is the earliest batch and $B_k$ the last batch in this sequence. Let $s_i$ be the start time of batch $B_i$. Obviously, $s_i < s_{i+1}$ for $1 \le i < k$. Let job $i$ be the one that maximizes $(1 + \delta) r_j + \delta p_j$ over all jobs $j$ in $B_i$. (Recall that $r_i$ and $p_i$ are the release time and processing time of job $i$ respectively.) By construction of our algorithm, the batch $B_1$ started exactly at time $(1 + \delta) r_1 + \delta p_1$ and a subsequent batch $B_i$, $i > 1$, may need to wait until the previous batch $B_{i-1}$ finished. Thus, we have

$$s_1 = (1 + \delta) r_1 + \delta p_1, \quad\text{and}\quad s_i \ge (1 + \delta) r_i + \delta p_i \text{ for } 2 \le i \le k.$$

Also, job $i+1$ must arrive after batch $B_i$ has been started. Therefore,

$$r_{i+1} > s_i \quad\text{for } 1 \le i < k.$$

Combining these we obtain the following lower bound on the $s_i$'s:

$$\begin{aligned} s_k &\ge (1 + \delta) r_k + \delta p_k \\ &> (1 + \delta) s_{k-1} + \delta p_k \\ &\ge (1 + \delta)^2 r_{k-1} + (1 + \delta)\delta p_{k-1} + \delta p_k \\ &\;\;\vdots \\ &\ge (1 + \delta)^k r_1 + (1 + \delta)^{k-1} \delta p_1 + \cdots + (1 + \delta)\delta p_{k-1} + \delta p_k. \end{aligned}$$
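Since the machine is busy from time $s_1$ onwards until all the batches $B_1, \ldots, B_k$ are finished, the makespan of our algorithm is

$$C_{\max} = s_1 + p_1 + p_2 + \cdots + p_k = (1 + \delta)(r_1 + p_1) + p_2 + \cdots + p_k.$$

On the other hand, the minimum makespan $C^*$ has to satisfy

$$\begin{aligned} C^* &\ge r_k + p_k \\ &> s_{k-1} + p_k \\ &\ge (1 + \delta)^{k-1} r_1 + (1 + \delta)^{k-2} \delta p_1 + \cdots + (1 + \delta)\delta p_{k-2} + \delta p_{k-1} + p_k. \end{aligned}$$

Therefore we have

$$\frac{C_{\max}}{C^*} \le \frac{(1 + \delta) r_1 + (1 + \delta) p_1 + p_2 + p_3 + \cdots + p_k}{(1 + \delta)^{k-1} r_1 + (1 + \delta)^{k-2} \delta p_1 + \cdots + (1 + \delta)\delta p_{k-2} + \delta p_{k-1} + p_k}.$$

If $k = 1$, then

$$\frac{C_{\max}}{C^*} \le \frac{(1 + \delta) r_1 + (1 + \delta) p_1}{r_1 + p_1} \le 1 + \delta.$$

If $k \ge 3$, then observe that $(1 + \delta) r_1 / ((1 + \delta)^{k-1} r_1) \le 1$, $(1 + \delta) p_1 / ((1 + \delta)^{k-2} \delta p_1) \le 1 + \delta$, ..., $p_k / p_k \le 1 + \delta$, making use of the fact that $\delta(1 + \delta) = 1$.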
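The identity $\delta(1 + \delta) = 1$ used above follows directly from the definition of $\delta$; a short worked derivation, added here for convenience:

```latex
\delta = \frac{\sqrt{5} - 1}{2}
\;\Longrightarrow\;
\delta^2 + \delta - 1 = \frac{6 - 2\sqrt{5}}{4} + \frac{2\sqrt{5} - 2}{4} - \frac{4}{4} = 0,
\qquad\text{so}\qquad
\delta(1 + \delta) = 1, \quad \frac{1}{1 + \delta} = \delta, \quad 1 + \delta = \frac{\sqrt{5} + 1}{2}.
```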

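For $k = 2$, we need an additional observation. The optimal schedule has to schedule the 2 batches $B_1$ and $B_2$ together at time $r_2$, if it is to behave differently from our algorithm. Therefore, $C^* \ge r_2 + \max\{p_1, p_2\}$. If $p_1 \le p_2$, then

$$\frac{C_{\max}}{C^*} \le \frac{(1 + \delta) p_1 + p_2}{\delta p_1 + p_2} \le 1 + \delta.$$

If $p_1 > p_2$, then

$$\frac{C_{\max}}{C^*} \le \frac{(1 + \delta) p_1 + p_2}{\delta p_1 + p_1} \le \frac{(2 + \delta) p_1}{(1 + \delta) p_1} = 1 + \delta.$$

Hence $C_{\max} / C^* \le 1 + \delta$ in any case. □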

6 Remarks and Discussion 



Batch processing is an interesting problem with practical applications whose algorithmic aspects have recently started to attract study. Many questions are still open and more research effort is needed. The on-line version of the problem is particularly interesting. While we have the exact bound for the case $B = \infty$, the gap has yet to be bridged when $B$ is bounded. One can easily obtain a 2-competitive algorithm in this case, but the best lower bound is $1 + \delta$. Our conjecture is that the exact bound is $1 + \delta$, the golden ratio.
Abstract. In this paper, we study the metric property of LexBFS-ordering on AT-free graphs. Based on a 2-sweep LexBFS algorithm, we show that every AT-free graph admits a vertex ordering, called the strong 2-cocomparability ordering, such that for any three vertices $u \prec v \prec w$ in the ordering, if $d(u, w) \le 2$ then $d(u, v) = 1$ or $d(v, w) \le 2$. As an application of this ordering, we provide a simple linear time recognition algorithm for bipartite permutation graphs, which form a subclass of AT-free graphs.



1 Introduction 

In past years, specific orderings of vertices characterizing certain graph classes have been studied by many researchers. Usually, these orderings can be described from a metric point of view, and the metric associated with a connected graph is, of course, the distance function $d$, giving the length of a shortest path between two vertices. One of the first results is due to Rose [19] for recognizing chordal graphs. A graph is chordal if every cycle of length at least four has a chord. A proof in [19] showed that every chordal graph $G$ admits a perfect elimination ordering, i.e., an ordering $v_1, v_2, \ldots, v_n$ of the vertices of $G$ such that for every $i < j < k$, if $d(v_i, v_j) = d(v_i, v_k) = 1$, then $d(v_j, v_k) = 1$. Here the vertex $v_i$, $i = 1, \ldots, n$, is called a simplicial vertex of the subgraph of $G$ induced by the set $\{v_i, \ldots, v_n\}$. Note that a vertex is simplicial in a graph if and only if it is not a midpoint of any induced path on three vertices. The perfect elimination ordering of a chordal graph can be computed in linear time from the Lexicographic Breadth-First Search (LexBFS) algorithm [20] or the Maximum
Cardinality Search (MCS) algorithm [2D] . 
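The definition of a perfect elimination ordering translates into a direct check. The following is a minimal brute-force sketch (not the linear-time method mentioned above); the function name and the two sample graphs are hypothetical.

```python
from itertools import combinations, permutations


def is_perfect_elimination_ordering(adj, order):
    """adj: dict vertex -> set of neighbours; order: list v_1, ..., v_n."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [w for w in adj[v] if pos[w] > pos[v]]
        # All later neighbours of v must be pairwise adjacent.
        if any(b not in adj[a] for a, b in combinations(later, 2)):
            return False
    return True


# A 4-cycle with the chord 1-3 is chordal; the plain 4-cycle is not.
chordal = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

chordal_ok = is_perfect_elimination_ordering(chordal, [2, 4, 1, 3])
cycle_has_peo = any(
    is_perfect_elimination_ordering(cycle, list(p)) for p in permutations(cycle)
)
```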

Another two well-known classes of graphs are comparability graphs and cocomparability graphs. A graph $G$ is a comparability graph if its vertex set has a transitive ordering, i.e., an ordering $v_1, v_2, \ldots, v_n$ of the vertices of $G$ such that for every $i < j < k$, if $d(v_i, v_j) = d(v_j, v_k) = 1$, then $d(v_i, v_k) = 1$.
There is an O(n^) time algorithm |2T] to test if a graph is a comparability 

graph. In the case of a positive answer, the algorithm also produces a transitive ordering. A cocomparability graph is the complement of a comparability graph, or equivalently, its vertex set has a cocomparability ordering, i.e., an ordering $v_1, v_2, \ldots, v_n$ of the vertices such that for every $i < j < k$, if $d(v_i, v_k) = 1$, then $d(v_i, v_j) = 1$ or $d(v_j, v_k) = 1$. Most efficient algorithms on comparability and cocomparability graphs are developed from these vertex orderings.

* This work was supported by the National Science Council, Republic of China under grant NSC89-2213-E-008-007.

Besides, some generalizations of these well-known orderings have been inve- 
stigated. Jamison and Olariu m generalized the concept of perfect elimination 
ordering in the following way: a vertex is semi-simplicial in a graph if it is not 
a midpoint of every induced path on four vertices. An ordering vi,...,Vn of 
the vertices in a graph G is a semi-perfect elimination ordering if and only if 
Vi is semi-simplicial in the subgraph of G induced by the set {vi, . . . , v„}. They 
also characterized the classes of graphs for which every ordering produced by 
LexBFS or MCS is a semi-perfect elimination ordering. Hoang and Reed m 
defined that a graph $G$ is $P_4$-comparability if it admits a vertex ordering which is transitive when restricted to any $P_4$. Thus, this class of graphs generalizes comparability graphs in a natural way. In [4], we generalized the concept of cocomparability ordering as follows. Let $t \ge 1$ be an integer. A $t$-cocomparability ordering (abbr. $t$-CCPO) of a graph $G$ is an ordering $v_1, v_2, \ldots, v_n$ of the vertices such that for every $i < j < k$, if $d(v_i, v_k) \le t$, then $d(v_i, v_j) \le t$ or $d(v_j, v_k) \le t$. A graph is called a $t$-cocomparability graph if it admits a $t$-CCPO. Thus, $G$ is a $t$-cocomparability graph if and only if every power $G^s$ for $s \ge t$ is a cocomparability graph (where $G^s$ is the graph with the same vertex set as $G$ such that two vertices are adjacent if and only if their distance in $G$ is at most $s$). Indeed, a proof in [4] showed that a vertex ordering of $G$ is a $t$-CCPO if and only if it is a cocomparability ordering of $G^s$ for $s \ge t$, and that the smallest integer $t$ for which a graph is a $t$-cocomparability graph can be determined in $O(M(n) \log n)$ time, where $M(n)$ denotes the time complexity of multiplying two $n \times n$ integer matrices. In particular, the 2-cocomparability graphs can be recognized in $O(M(n))$ time.
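The $t$-CCPO condition can be tested by brute force on small graphs. The sketch below is illustrative only (it is not the matrix-multiplication method cited above); the helper names and the 5-cycle example are hypothetical. The 5-cycle is AT-free but not a cocomparability graph, so it should admit a 2-CCPO but no 1-CCPO.

```python
from collections import deque
from itertools import combinations, permutations


def all_distances(adj):
    """All-pairs shortest-path lengths by BFS from every vertex."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in d:
                    d[w] = d[u] + 1
                    queue.append(w)
        dist[s] = d
    return dist


def is_t_ccpo(dist, order, t):
    """Check the t-CCPO condition for every triple a before b before c."""
    for a, b, c in combinations(order, 3):
        if dist[a][c] <= t and dist[a][b] > t and dist[b][c] > t:
            return False
    return True


c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
d5 = all_distances(c5)
has_1_ccpo = any(is_t_ccpo(d5, p, 1) for p in permutations(range(5)))
has_2_ccpo = any(is_t_ccpo(d5, p, 2) for p in permutations(range(5)))
```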

An asteroidal triple (or AT for short) of a graph is an independent set of three 
vertices such that every pair of vertices are joined by a path avoiding the closed 
neighborhood of the third. Graphs without asteroidal triple are called AT-free 
graphs. Lekkerkerker and Boland m first introduced the concept of asteroidal 
triples to characterize the interval graphs. A graph is an interval graph if and only 
if it is chordal and AT-free. Golumbic et al. [10] showed that cocomparability 
graphs (and thus permutation and trapezoid graphs) are also AT-free. The best 
known recognition algorithm for AT-free graphs requires O(n^) time 0- The 
polynomial time algorithms for solving stability and various domination-type 
problems on AT-free graphs can be found in [,S|6|8n . Besides, some algorithmic 
problems such as treewidth m, minimum fill-in m and vertex ranking m on 
AT-free graphs are known to be NP-complete. 

Recently, Corneil, Olariu and Stewart [7] obtained a collection of interesting structural properties for AT-free graphs. However, up to now nice characterizations of AT-free graphs such as a geometric intersection model and an elimination scheme are not known. It would be interesting to see whether the AT-free graphs also possess some vertex ordering which is useful for algorithmic purposes. Based on a decomposition property, called the involutive sequence, proposed by Corneil et al., in [4] we showed that every AT-free graph possesses a 2-CCPO, and the class of AT-free graphs is properly contained in the class of 2-cocomparability graphs. Consequently, every proper power $G^k$ ($k \ge 2$) of an AT-free graph $G$ is a cocomparability graph. This result implies that the $k$-domination and $k$-stability problems for $k \ge 2$ on AT-free graphs can be solved in a more efficient way.

In this paper, we continue this work by investigating the metric property of LexBFS-ordering on AT-free graphs. We prove that every AT-free graph has a vertex ordering, called the strong 2-CCPO, which is stronger than the 2-CCPO and is also a generalization of the cocomparability ordering. In particular, we show that this ordering can be generated by a 2-sweep LexBFS algorithm. The concept of strong 2-CCPO and the 2-sweep LexBFS will be introduced in the next section. Moreover, based on a modified 2-sweep LexBFS algorithm, we show that the strong 2-CCPO can be used to design a recognition algorithm for bipartite permutation graphs in $O(n + m)$ time. This approach is easier than the algorithm of [22].

2 AT-Free Graphs Are Strong 2-Cocomparability Graphs 

All graphs considered in this paper are undirected, simple (i.e., without loops and multiple edges) and connected. Let $G = (V, E)$ be a graph with vertex set $V$ of size $n$ and edge set $E$ of size $m$. The distance of two vertices $u, v \in V$, denoted by $d_G(u, v)$, is the number of edges of a shortest path from $u$ to $v$ in $G$. When no ambiguity arises, the subscript $G$ can be omitted. A path joining two vertices $u$ and $v$ is termed a $u$-$v$ path. The union of two paths $P$ and $P'$ with a common endpoint is denoted by $P \oplus P'$. A vertex $u$ misses a path $P$ if there are no vertices of $P$ adjacent to $u$; otherwise, we say that $u$ intercepts $P$. The open neighborhood $N(u)$ of a vertex $u \in V$ is the set $\{v \in V : (u, v) \in E\}$, and the closed neighborhood $N[u]$ is $N(u) \cup \{u\}$. Notations and terminologies not given here may be found in any standard textbook on graphs and algorithms.

We first introduce the notion of the strong $t$-cocomparability ordering. Let $t \ge 1$ be an integer. A strong $t$-cocomparability ordering (abbr. strong $t$-CCPO) of a graph $G$ is an ordering $v_1, v_2, \ldots, v_n$ of $V$ such that for every $i < j < k$, if $d(v_i, v_k) \le t$, then $d(v_i, v_j) = 1$ or $d(v_j, v_k) \le t$. A graph is called a strong $t$-cocomparability graph if it admits a strong $t$-CCPO. Clearly, every cycle $C_{t+3}$ ($t \ge 1$) admits a strong $t$-CCPO, and every strong $t$-cocomparability graph is a $t$-cocomparability graph, but the converse is not true. For instance, the even cycle $C_{2t+2}$ ($t \ge 2$) is a $t$-cocomparability graph and it has no strong $t$-CCPO. Also, it is easy to see that all the classes of strong $t$-cocomparability graphs, $t \ge 1$, constitute a hierarchy by set inclusion.
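The separation between $t$-CCPO and strong $t$-CCPO can be checked exhaustively on the 6-cycle with $t = 2$. The sketch below is a brute-force illustration with hypothetical helper names; it enumerates all 720 orderings.

```python
from collections import deque
from itertools import combinations, permutations


def all_distances(adj):
    """All-pairs shortest-path lengths by BFS from every vertex."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in d:
                    d[w] = d[u] + 1
                    queue.append(w)
        dist[s] = d
    return dist


def is_ccpo(dist, order, t, strong):
    """Check the (strong) t-CCPO condition for every triple a before b before c."""
    for a, b, c in combinations(order, 3):
        near_a = dist[a][b] == 1 if strong else dist[a][b] <= t
        if dist[a][c] <= t and not near_a and dist[b][c] > t:
            return False
    return True


c6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
d6 = all_distances(c6)
orders = list(permutations(range(6)))
has_2_ccpo = any(is_ccpo(d6, p, 2, strong=False) for p in orders)
has_strong_2_ccpo = any(is_ccpo(d6, p, 2, strong=True) for p in orders)
```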

LexBFS was designed to provide a linear-time recognition algorithm for chordal graphs [20]. According to LexBFS, the vertices of a graph $G$ are numbered from $n$ to 1 in decreasing order. The label $L(u)$ of an unnumbered vertex $u$ is the list of its numbered neighbors in the current search. As the next vertex to be numbered, select the vertex $v$ with the lexicographically largest label, breaking ties arbitrarily. The algorithm runs in $O(n + m)$ time. Below we reproduce the details of LexBFS that begins from a distinguished vertex $v$ of $G$.

Procedure LexBFS(G, v)

Input: a connected graph G = (V, E) and a vertex v ∈ V.

Output: a numbering σ of the vertices of G.
begin
    L(v) ← n;
    for each vertex u ∈ V \ {v} do L(u) ← ∅;
    for i ← n downto 1 do begin
        choose an unnumbered vertex u with the lexicographically largest label;
        σ(u) ← i; {assign the number i to vertex u}
        for each unnumbered vertex w ∈ N(u) do
            append i to L(w);
    end
end.
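The procedure can be transcribed almost line by line. The following is a simple quadratic-time sketch for illustration, not the linear-time partition-refinement implementation; ties are broken by dictionary order, which is one arbitrary choice among many. The function name `lexbfs` is an assumption.

```python
def lexbfs(adj, start):
    """adj: dict vertex -> iterable of neighbours. Returns sigma: vertex -> number."""
    n = len(adj)
    label = {v: [] for v in adj}
    label[start] = [n]          # forces the start vertex to be numbered first
    sigma = {}
    for i in range(n, 0, -1):
        # Python compares lists lexicographically, matching the label order.
        u = max((v for v in adj if v not in sigma), key=lambda v: label[v])
        sigma[u] = i
        for w in adj[u]:
            if w not in sigma:
                label[w].append(i)
    return sigma


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
sigma = lexbfs(path, 0)
```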

Let $(V, \prec)$ be the vertex ordering corresponding to the numbering $\sigma$ produced by LexBFS($G$, $v$). For any two vertices $a, b \in V$, we write $a \prec b$ (or $b \succ a$) whenever $\sigma(a) < \sigma(b)$, and we say that $b$ is larger than $a$ or $a$ is smaller than $b$. In addition, if $a$ is no larger than $b$, we simply write $a \preceq b$ (or $b \succeq a$). For convenience, the subgraph of $G$ induced by the vertex set $\{w \in V : w \succeq u\}$ is denoted by $G[u]$. Note that, if $G$ is a connected graph, then $G[u]$ is also connected for every $u \in V$.

A dominating pair in a graph is a pair of vertices such that for any path connecting these two vertices, all remaining vertices of the graph are in the neighborhood of this path. Recently, an algorithm called the 2-sweep LexBFS was proposed by Corneil et al. [8] for finding dominating pairs of a given connected AT-free graph $G = (V, E)$. Their algorithm works as follows. First, start a LexBFS from an arbitrary vertex $v \in V$. Let $z$ be the vertex with the smallest number in this search. Again, start a new LexBFS from $z$. Let $(V, \prec) = (y \prec \cdots \prec z)$ be the ordering of vertices produced by LexBFS($G$, $z$). It is shown in [8] that $y$ and $z$ constitute a dominating pair of $G$. The correctness of their algorithm is based on the following property.
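The two sweeps can be sketched on top of a simple LexBFS. This is an illustrative quadratic-time version with an arbitrary (dictionary-order) tie-break; the function names are hypothetical.

```python
def lexbfs(adj, start):
    """Simple LexBFS; returns sigma: vertex -> number in n..1."""
    n = len(adj)
    label = {v: [] for v in adj}
    label[start] = [n]
    sigma = {}
    for i in range(n, 0, -1):
        u = max((v for v in adj if v not in sigma), key=lambda v: label[v])
        sigma[u] = i
        for w in adj[u]:
            if w not in sigma:
                label[w].append(i)
    return sigma


def two_sweep_lexbfs(adj, start):
    """Returns the ordering y, ..., z (smallest to largest) of the second sweep."""
    first = lexbfs(adj, start)
    z = min(first, key=first.get)           # vertex numbered 1 in the first sweep
    second = lexbfs(adj, z)
    return sorted(second, key=second.get)   # order[0] is y, order[-1] is z


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
order = two_sweep_lexbfs(path, 2)
```

On a path, whichever endpoint the first sweep ends at, the second sweep yields the two endpoints as the pair $(y, z)$.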

Lemma 1. (Corneil et al. [5]) Let $G = (V, E)$ be an AT-free graph and $(V, \prec)$ be the vertex ordering of $G$ produced by a 2-sweep LexBFS. Then, for every two vertices $u \prec w$, $w$ intercepts all $u$-$z$ paths in $G$, where $z$ is the largest vertex of $(V, \prec)$.

Through the rest of this paper, we assume that $G = (V, E)$ is a connected AT-free graph and $(V, \prec) = (y \prec \cdots \prec z)$ is a vertex ordering of $G$ produced by a 2-sweep LexBFS. For each vertex $u \in V$, the largest vertex in $N[u]$ is denoted by $ln(u)$. Note that, since $G[u]$ is connected, $ln(u) \succ u$ for all $u \ne z$, and $ln(z) = z$. We now show that $(V, \prec)$ is a strong 2-CCPO of $G$.
Lemma 2. Let $a \prec b \prec c$ be any three vertices of $G$ such that $b$ misses a path from $a$ to $c$. Then $ln(b) = ln(c)$.

Proof. Since $b \prec c$, clearly $ln(b) \preceq ln(c)$. We now prove $ln(b) = ln(c)$ by contradiction. Suppose to the contrary that $ln(b) \prec ln(c)$. Let $Q$ be an $a$-$c$ path missed by $b$ and $P$ be a $ln(c)$-$z$ path in $G[ln(c)]$. Since $ln(b) \prec ln(c)$, $b$ misses $P$. Thus, $b$ misses the $a$-$z$ path $Q \oplus (c, ln(c)) \oplus P$. However, this contradicts Lemma 1, which states that $b$ intercepts all $a$-$z$ paths in $G$. □



Lemma 3. The ordering $(V, \prec)$ is a strong 2-CCPO of $G$.

Proof. Let $a \prec b \prec c$ be any three vertices of $G$ with $d(a, c) \le 2$. If $b$ intercepts a shortest path from $a$ to $c$, it is easy to see that $(a, b) \in E$ or $d(b, c) \le 2$. On the other hand, if $b$ misses every shortest path from $a$ to $c$, then by Lemma 2, $ln(b) = ln(c)$. Thus $(b, ln(c), c)$ is a path in $G$ and $d(b, c) \le 2$. □

Fig. 1 shows a strong 2-cocomparability graph with a strong 2-CCPO $a \prec b \prec c \prec d \prec e \prec f$, and it contains $\{a, b, f\}$ as an AT. Indeed, every vertex ordering produced by a 2-sweep LexBFS in the graph is a strong 2-CCPO. Thus we have the following result.

Fig. 1. A strong 2-cocomparability graph which is not AT-free.

Theorem 1. AT-free graphs are properly contained in the class of strong 2-cocomparability graphs. Furthermore, a strong 2-CCPO of an AT-free graph can be produced by a 2-sweep LexBFS algorithm in $O(n + m)$ time.

Since a strong 2-CCPO is a 2-CCPO and $G$ is a 2-cocomparability graph if and only if every power $G^k$ ($k \ge 2$) is a cocomparability graph, Lemma 3 also implies that all proper powers of AT-free graphs are cocomparability graphs.




3 Recognition of Bipartite AT-Free Graphs

A graph is bipartite if its vertex set can be partitioned into $S$ and $T$ such that each edge has one end in $S$ and the other end in $T$. Let $G = (S, T, E)$ denote a bipartite graph. An ordering of $S$ has the adjacency property if for each vertex $t \in T$, $N(t)$ consists of consecutive vertices in the ordering of $S$. An ordering of $S$ has the enclosure property if for every pair of vertices $t, t' \in T$ such that $N(t) \subseteq N(t')$, $N(t') \setminus N(t)$ consists of consecutive vertices in the ordering of $S$. A strong ordering $(S \cup T, \prec)$ is a combination of an ordering of $S$ and an ordering of $T$ such that for any two edges $(s, t'), (s', t) \in E$, where $s, s' \in S$ with $s \prec s'$ and $t, t' \in T$ with $t \prec t'$, it implies that $(s, t) \in E$ and $(s', t') \in E$.
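The strong ordering condition only involves the relative order within $S$ and within $T$, so it can be checked from the two orderings separately. A minimal brute-force sketch, with a hypothetical function name and a small path as sample input:

```python
def is_strong_ordering(s_order, t_order, edges):
    """s_order, t_order: lists giving the orderings of S and T; edges: set of (s, t) pairs."""
    for i, s in enumerate(s_order):
        for s2 in s_order[i + 1:]:                # s comes before s2
            for j, t in enumerate(t_order):
                for t2 in t_order[j + 1:]:        # t comes before t2
                    if (s, t2) in edges and (s2, t) in edges:
                        # Crossing edges force both "parallel" edges to exist.
                        if (s, t) not in edges or (s2, t2) not in edges:
                            return False
    return True


# The path s1 - t1 - s2 - t2.
edges = {("s1", "t1"), ("s2", "t1"), ("s2", "t2")}
good = is_strong_ordering(["s1", "s2"], ["t1", "t2"], edges)
bad = is_strong_ordering(["s1", "s2"], ["t2", "t1"], edges)
```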

A graph is a permutation graph if there exist two permutations of its vertices 
such that two vertices, say u and v, are adjacent if and only if u precedes v 
in one permutation and v precedes u in the other. Indeed, the orderings of 
vertices corresponding to these two permutations are both transitive ordering 
and cocomparability ordering. Thus, a graph G is a permutation graph if and 
only if both G and its complement are comparability graphs P|. A bipartite 
permutation graph is both a bipartite graph and a permutation graph. This class of graphs was studied by Spinrad et al. [22] and can be characterized as follows.

Theorem 2. (Spinrad et al. [22]) The following statements are equivalent for a bipartite graph $G = (S, T, E)$.

(1) $G$ is a bipartite permutation graph.

(2) There is a strong ordering of $S$ and $T$.

(3) There exists an ordering of $S$ (or $T$) which has the adjacency and enclosure properties.
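Condition (3) can be verified for a given ordering of $S$ directly from the definitions. The following is a straightforward quadratic sketch, not the verification procedure of the cited work; the function names and the sample neighborhoods are hypothetical.

```python
def is_consecutive(positions):
    """True if the set of positions forms a contiguous block (or is empty)."""
    return not positions or max(positions) - min(positions) + 1 == len(positions)


def has_adjacency_and_enclosure(s_order, nbrs_of_t):
    """s_order: list ordering S; nbrs_of_t: dict t -> set of neighbours in S."""
    pos = {s: i for i, s in enumerate(s_order)}
    for n_t in nbrs_of_t.values():
        if not is_consecutive({pos[s] for s in n_t}):
            return False                          # adjacency property fails
    for t, n_t in nbrs_of_t.items():
        for t2, n_t2 in nbrs_of_t.items():
            if t != t2 and n_t <= n_t2:
                if not is_consecutive({pos[s] for s in n_t2 - n_t}):
                    return False                  # enclosure property fails
    return True


order = ["a", "b", "c"]
ok = has_adjacency_and_enclosure(order, {"t1": {"a", "b", "c"}, "t2": {"a"}})
fails = has_adjacency_and_enclosure(order, {"t1": {"a", "b", "c"}, "t2": {"b"}})
```

In the second input, $N(t_1) \setminus N(t_2) = \{a, c\}$ is split by $b$, so the enclosure property fails even though adjacency holds.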

Note that it can be seen from [22] that if an ordering of $S$ and $T$ is a strong ordering, then both the ordering of $S$ and the ordering of $T$ have the adjacency and enclosure properties, provided that all isolated vertices of $G$ appear at the beginning of the orderings of $S$ and $T$. However, if orderings of $S$ and $T$ satisfy the adjacency and enclosure properties, it does not imply that the combination of these two orderings forms a strong ordering of $G$. Based on the characterizations described in Theorem 2, Spinrad et al. developed an $O(n + m)$ time algorithm for recognizing bipartite permutation graphs.

A graph is bipartite AT-free if it is a bipartite graph and does not contain 
an AT. It is clear that a bipartite permutation graph is bipartite AT-free. In [^, 
Gallai characterized the comparability graphs and the cocomparability graphs 
(and thus the permutation graphs) in term of a complete list of forbidden induced 
subgraphs. Since every bipartite forbidden structure from Gallai’s list always 
contains an AT, we conclude that a bipartite AT-free graph must be a bipartite 
permutation graph. In fact, the following observation is mentioned in a recent 
book of Brandstadt et al. 

Proposition 1. (Brandstädt et al. [1]) $G$ is a bipartite permutation graph if and only if $G$ is bipartite and $G$ contains no asteroidal triple.

Due to the fact that bipartite permutation graphs are exactly bipartite AT-free graphs, we shall borrow from the notion of AT-free to provide a new recognition algorithm for this class of graphs. Since there is a simple linear time algorithm to determine whether a graph is bipartite, we can restrict our attention to bipartite graphs. Recall that the recognition algorithm provided in [22] maintains a data structure which is used to represent all possible orderings of $S$ satisfying the adjacency and enclosure properties. Indeed, their algorithm is similar to the algorithm of Booth and Lueker for testing consecutive arrangement
[2], where enclosure adds another type of constraint. If the input graph is a permutation graph, their algorithm generates an ordering which has the adjacency and enclosure properties. Otherwise, this algorithm may still produce a candidate ordering. Thus, they also provided a way to verify whether this ordering has the adjacency and enclosure properties.

Our recognition algorithm has the same time complexity as the algorithm of [22]. The two algorithms differ in their approach to generating the vertex ordering. We first use a modified 2-sweep LexBFS as a procedure to construct a vertex ordering. Note that, if a breadth-first search is applied to a bipartite graph $G = (S, T, E)$, then we can immediately determine the orderings of $S$ and $T$, respectively. We then show that $G$ is bipartite AT-free if and only if both the orderings of $S$ and $T$ generated from the above procedure have the adjacency and enclosure properties. So we can simply examine the adjacency and enclosure properties in this specific ordering. Before introducing the modified 2-sweep LexBFS algorithm, we establish some basic properties. Assume that $(S \cup T, \prec) = (y \prec \cdots \prec z)$ is an ordering of the vertices produced by a 2-sweep LexBFS on a bipartite AT-free graph $G$. Then we have the following lemmas.

Lemma 4. For any $s \prec s' \prec t$, $s, s' \in S$ and $t \in T$, if $(s, t) \in E$ then $(s', t) \in E$.

Proof. By Lemma 3, if $s \prec s' \prec t$ and $(s, t) \in E$, then $d(s, s') = 1$ or $d(s', t) \le 2$. Since $(s, s') \notin E$ and the distance $d(s', t)$ is odd, the result follows. □

By the symmetry of $S$ and $T$, we also have the fact: for any three vertices $t \prec t' \prec s$, $s \in S$ and $t, t' \in T$, if $(s, t) \in E$ then $(s, t') \in E$.

Define $\Gamma(u, v) = \{w \in N(u) : v \prec w\}$ for $u, v \in S \cup T$, and let $\gamma(u, v) = |\Gamma(u, v)|$. In the following two lemmas, we assume $s, s' \in S$ and $t, t' \in T$ with $s \prec s'$ and $t \prec t'$.

Lemma 5. If $(s, t'), (s', t) \in E$, then $(s', t') \in E$.

Proof. Clearly, either $s \prec s' \prec t'$ or $t \prec t' \prec s'$ must be true. By Lemma 4, $(s, t') \in E$ implies $(s', t') \in E$ in the former case, and $(s', t) \in E$ implies $(s', t') \in E$ in the latter case. □



Lemma 6. Suppose $s \prec t$, $(s, t'), (s', t) \in E$ and $(s, t) \notin E$. Then $\Gamma(t, s) \subseteq \Gamma(t', s)$ and $\Gamma(t, t') = \Gamma(t', t')$.

Proof. By definition, every vertex in $\Gamma(t, s)$ is larger than $s$. Since $t \prec t'$ and $(s, t') \in E$, by Lemma 5 every vertex $w \in \Gamma(t, s)$ must be adjacent to $t'$. Thus, $\Gamma(t, s) \subseteq \Gamma(t', s)$. Note that this implies $\Gamma(t, u) \subseteq \Gamma(t', u)$ for all $s \preceq u$. In particular, $\Gamma(t, t') \subseteq \Gamma(t', t')$.

It remains to prove that $\Gamma(t, t') \supseteq \Gamma(t', t')$. Suppose to the contrary that there exists a vertex $s''$ in $\Gamma(t', t')$ but not in $\Gamma(t, t')$. Consider the three vertices $s \prec t \prec s''$. Since $d(s, s'') \le 2$ and $(s, t), (s'', t) \notin E$, $t$ misses every shortest path from $s$ to $s''$. By Lemma 2, $ln(t) = ln(s'')$. This is a contradiction because $s''$ and $t$ have no common neighbor. □

A symmetric statement is: if $t \prec s$, $(s, t'), (s', t) \in E$ and $(s, t) \notin E$, then $\Gamma(s, t) \subseteq \Gamma(s', t)$ and $\Gamma(s, s') = \Gamma(s', s')$.

To obtain a strong ordering of a bipartite AT-free graph, we modify the 
second sweep of a 2-sweep LexBFS as follows. Let G = (V,E) be a graph. In 
the second sweep of LexBFS procedure, we first initialize a variable f{v) for 
each u G H to be the degree deg{v) of v. In each step of the LexBFS, we select 
a vertex to number. The selected vertex is a vertex with the lexicographically 
largest label. If there are more than one vertices with the same largest label, a 
vertex among those vertices with the smallest value / is selected. After a vertex 
V has been numbered, we update f{w) to be f(w) — 1 for each vertex w G N(v). 
In other word, when u is about to be numbered, for all v € V with v ^ u, f{v) 
denotes the degree of v in the subgraph of G induced by the set {w G H : w u}. 
Notice that the modified 2-sweep LexBFS is a special 2-sweep LexBFS. Thus, 
Lemmas H E and El are still true for the ordering produced by the modified 
algorithm in a bipartite AT-free graph. 
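
The selection rule described above can be illustrated with a small sketch. The code below is not the authors' implementation: it stores labels explicitly (so it runs in quadratic rather than linear time), and the function name, the `start` parameter used to force the first vertex of the sweep, and the tie-break by vertex identifier are assumptions made only for this illustration.

```python
def modified_lexbfs(adj, start=None):
    """Return the vertices in the order in which they are numbered.

    adj maps each vertex to a list of its neighbours. Numbers are handed
    out from n down to 1, so the first vertex returned gets number n.
    """
    n = len(adj)
    label = {v: [] for v in adj}        # explicit labels (illustration only)
    f = {v: len(adj[v]) for v in adj}   # f(v) starts as deg(v)
    if start is not None:
        label[start] = [n + 1]          # assumption: force the sweep to begin at start
    unnumbered = set(adj)
    order = []
    for number in range(n, 0, -1):
        # Lexicographically largest label first; ties go to the smallest f.
        # sorted() makes remaining ties deterministic (smallest vertex id).
        v = max(sorted(unnumbered), key=lambda x: (label[x], -f[x]))
        unnumbered.remove(v)
        order.append(v)
        for w in adj[v]:
            f[w] -= 1                   # one fewer unnumbered neighbour
            if w in unnumbered:
                label[w].append(number)
    return order


# Hypothetical example: path 2 - 0 - 1 - 3.
EXAMPLE = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
```

In the example, after vertex 0 is numbered, vertices 1 and 2 carry the same label, and vertex 2 is chosen next because its value of f is smaller.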

Lemma 7. Any ordering of the vertices of a bipartite AT-free graph produced 
by a modified 2-sweep LexBFS algorithm is a strong ordering. 

Proof. Let G = (S,T,E) be a bipartite AT-free graph and {S U T, ^) be the 
vertex ordering of G that is produced by a modified 2-sweep LexBFS algorithm. 
Suppose the contrary that {S U T, ^) is not a strong ordering of G. That is, 
there exist vertices s, s' G S and t,t' G T with s ^ s' and t < t' such that 
(s, f'), (s', t) G E and (s,t) ^ E. Without loss of generality assume s ^ t. By 
Lemmas!^ we have F{t,s) C F{t',s) and E{t,t') = F{t',t'). Thus, 7 (t, s) < 
7 (t',s) and 'y{t,t') = 7 ( 7 ', f'). Since F{t,t') = T(t',f'), when t' is about to be 
numbered, the labels Lft) and L{t') are the same. At this time, f{t') < f{t) by 
the fact that t <t' and the selecting rule of the modified 2-sweep LexBFS. Thus, 

degft') = f{t') + -i{t' ,t') < /(t) -h 7 (t,t') = deg{t). 

On the other hand, we consider the values f{t) and f{t') when s is about to be 
numbered. Since deg{t) > deg{t') and < 7 (t',s), it implies f{f) > f{t') 

when s is about to be numbered. Since (s,t') G E and (s,f) ^ E, there is some 
vertex s* G S' with s* ^ s such that it is adjacent to t in G. Since s* ^ s ^ t, 
by Lemma [H (s*, t) G E implies (s,t) G E, which is a contradiction. □ 

Theorem 3. A bipartite graph G = (S, T, E) is AT-free if and only if the mo- 
dified LexBFS algorithm generates an ordering of G such that both the orderings 
of S and T have the adjacency and enclosure properties. 

Proof. The “only if” part directly follows from Lemma 7 and the fact that a strong
ordering implies that both S and T have the adjacency and enclosure properties.
Conversely, if G is not a bipartite AT-free graph, by Theorem none of the 
orderings of S and T satisfy the adjacency and enclosure properties. □ 

In order to use the modified 2-sweep LexBFS algorithm to recognize bipartite 
AT-free graphs, we need an efficient method to examine whether or not a specific 
vertex ordering produced by the algorithm has the adjacency and enclosure 
properties. In fact, Spinrad et al. [22] have proposed an $O(n + m)$ time algorithm
for this task. To show that the total complexity of our implementation is $O(n + m)$
time, we now consider the time taken by the second sweep of LexBFS.
Recall that the LexBFS algorithm presented in does not actually calculate 
the labels, but rather keeps the unnumbered vertices in lexicographic order. All
unnumbered vertices are placed in a number of separate sets $U_l$, each of which
contains those vertices with the same label $l$ in the current search. Each $U_l$ is
represented by a doubly linked list. The sets are kept in a queue ordered
lexicographically by label from smallest to largest. Initially, the queue contains
only one set $U_0 = V \setminus \{z\}$.
When a vertex $u$ is selected to be numbered, for each set $U$ containing a vertex
$w \in N(u)$, create a new set $U'$ which is inserted immediately in front of $U$ in
the queue. Then move all such vertices $w$ from $U$ to the new set $U'$. This method
maintains the lexicographic ordering without actually calculating the labels. To
obtain the modified LexBFS from the above implementation of LexBFS, we initially
sort the adjacency list of each vertex of $G$ and the vertices of $U_0$ by degree in
increasing order. Then, in each stage all vertices of each new set $U'$ are kept in
the same sequence as in $U$, so the vertex with the smallest value of $f$ is always
located at the front of its list. Since computing the degrees of the vertices and
sorting the adjacency list of each vertex of $G$ and $U_0$ (using a radix sort) takes
$O(n + m)$ time, the modified algorithm can be implemented in $O(n + m)$ time.
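
The idea of keeping degree-sorted classes instead of labels can be sketched as follows. This is a simplified illustration rather than the linear-time implementation described above: it uses plain Python lists in place of doubly linked lists, keeps the class with the largest label at the front of the list (a convention chosen for this sketch), and takes quadratic time. The function name and the `start` argument are assumptions.

```python
def modified_lexbfs_partition(adj, start):
    """Partition-refinement sketch of the modified sweep, beginning at start."""
    # All other vertices form one class, sorted by increasing degree.
    rest = sorted((v for v in adj if v != start), key=lambda v: (len(adj[v]), v))
    classes = [[start]] + ([rest] if rest else [])
    order = []
    while classes:
        v = classes[0].pop(0)           # front vertex of the front class
        if not classes[0]:
            classes.pop(0)
        order.append(v)
        nbrs = set(adj[v])
        refined = []
        for cls in classes:
            # Stable split: neighbours of v move in front, relative order is kept,
            # so each class stays sorted by degree (and hence by f).
            inside = [w for w in cls if w in nbrs]
            outside = [w for w in cls if w not in nbrs]
            refined.extend(part for part in (inside, outside) if part)
        classes = refined
    return order


# Hypothetical example: path 2 - 0 - 1 - 3.
EXAMPLE = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
```

Within one class all vertices have the same numbered neighbours, so ordering a class by degree is the same as ordering it by f; this is why a stable split is enough to keep the smallest f at the front.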

4 Concluding Remarks 

In this paper, we investigate the metric property of LexBFS-ordering on AT- 
free graphs. We show that for an AT-free graph G, the ordering produced by a 
2-sweeps LexBFS algorithm is a strong 2-CCPO, which generalizes the concept 
of cocomparability ordering. This result also implies that every proper power 
of an AT-free graph is a cocomparability graph. In particular, if G is a bipar- 
tite permutation graph, a slight modification of the algorithm can construct a 
strong ordering for such a graph. This suggests a simple linear time recogni- 
tion algorithm for bipartite permutation graphs. As a final comment, we note 
that the 2-sweep LexBFS is a valuable tool for solving algorithmic problems in 
AT-free graphs, such as finding dominating pairs [S|, constructing tree spanners 
[II 6j . and examining diameter j^. In Section [2l we have given an example to 
show that a graph containing AT may produce a strong 2-CCPO by a 2-sweep 
LexBFS algorithm. Can one characterize the graphs for which every ordering of 
the vertices produced by a 2-sweep LexBFS is a strong 2-CCPO? Also, we ask 
whether or not there are efficient recognition algorithms for the class of strong 
t-cocomparability graphs where $t > 2$?

Abstract. In this paper, we consider parallel algorithms for shortest paths
and related problems on trapezoid graphs under the CREW PRAM
model. Given a trapezoid graph with its corresponding trapezoid diagram,
we present parallel algorithms solving the following problems: For
the single-source shortest path problem, the algorithm runs in $O(\log n)$
time using $O(n)$ processors and space. For the all-pair shortest path query
problem, after spending $O(\log n)$ preprocessing time using $O(n \log n)$
space and $O(n)$ processors, the algorithm can answer the query in $O(\log \delta)$
time using one processor. Here $\delta$ denotes the distance between two queried
vertices. For the minimum cardinality Steiner set problem, the algorithm
runs in $O(\log n)$ time using $O(n)$ processors and space.

We also extend our results to the generalized trapezoid graphs. The
single-source shortest path problem and the minimum cardinality Steiner
set problem on d-trapezoid graphs and circular d-trapezoid graphs can
both be solved in $O(\log n \log d)$ time using $O(nd)$ space and $O(d^2 n / \log d)$
processors. The all-pair shortest path query problem on d-trapezoid graphs
and circular d-trapezoid graphs can be answered in $O(d \log \delta)$ time
using one processor after spending $O(\log n \log d)$ preprocessing time using
$O(nd \log n)$ space and $O(d^2 n / \log d)$ processors.



1 Introduction 

The intersection graph of a collection of trapezoids with corner points lying on 
two parallel lines is called the trapezoid graph [5]. Note that trapezoid graphs 
are perfect and properly contain both interval graphs and permutation graphs. 
Trapezoid graphs are perfect since they are cocomparability graphs. 

The single-source shortest path (SSSP) problem is the problem of finding the 
shortest paths between a given vertex and all other vertices. The all-pair shortest 
path problem is the problem of finding the shortest paths between all pairs of 
vertices. Instead of finding all pairs of shortest paths, the all-pair shortest path
query (APSPQ) problem is described as follows: First, apply a fast preprocessing
algorithm and construct a data structure. Using the data structure, a query
on the length of a shortest path between any two vertices can be answered very
fast.

In 1^, Ibarra et al. proposed an $O(\log n)$ time parallel algorithm using
$O(n/\log n)$ processors on the EREW PRAM to solve the SSSP problem for
permutation graphs. Recently, Chao et al. [^, using $O(n/\log n)$ EREW PRAM
processors, presented an $O(\log n)$ preprocessing parallel algorithm to build an
$O(n)$ space data structure for the APSPQ problem on permutation graphs such that
each query can be answered in constant time. In |^, using $O(n/\log n)$ CREW
PRAM processors, Chen et al. proposed an $O(\log n)$ preprocessing parallel
algorithm to build an $O(n)$ space data structure for the APSPQ problem on interval
and circular-arc graphs such that each query can be answered in constant time.
For trapezoid graphs, Liang |H] designed a linear time sequential algorithm for
the SSSP problem. To our knowledge, no parallel algorithm for the SSSP problem
on trapezoid graphs has been proposed in the literature. Although
there are efficient parallel algorithms for the APSPQ problem on permutation
and interval graphs, there was no efficient sequential algorithm for this problem
on trapezoid graphs.

Given a graph $G = (V, E)$ and a set $U \subseteq V$, a Steiner set for $U$ in $G$ is a
set $S \subseteq V \setminus U$ of vertices such that $S \cup U$ induces a connected subgraph. The
minimum cardinality Steiner set (MCSS) problem is the problem of finding a
Steiner set of the smallest cardinality. Liang |S] proposed a linear time sequential
algorithm for the MCSS problem on trapezoid graphs.

In this paper, we consider parallel algorithms for shortest paths and related
problems on trapezoid graphs under the CREW PRAM model. Given a trapezoid
graph with $n$ vertices, we present parallel algorithms solving the following
problems: For the SSSP problem, the algorithm runs in $O(\log n)$ time using
$O(n)$ processors and space. For the APSPQ problem, after spending $O(\log n)$
preprocessing time using $O(n \log n)$ space and $O(n)$ processors, the algorithm
can answer the query in $O(\log \delta)$ time using one processor. Here $\delta$ denotes the
distance between two queried vertices. For the MCSS problem, the algorithm
runs in $O(\log n)$ time using $O(n)$ processors and space.

We also extend our results to the generalized trapezoid graphs. The SSSP
problem and the MCSS problem on d-trapezoid graphs and circular d-trapezoid
graphs can both be solved in $O(\log n \log d)$ time using $O(nd)$ space and
$O(d^2 n / \log d)$ processors. The APSPQ problem on d-trapezoid graphs and circular
d-trapezoid graphs can be answered in $O(d \log \delta)$ time using one processor after
spending $O(\log n \log d)$ preprocessing time using $O(nd \log n)$ space and
$O(d^2 n / \log d)$ processors.

All of our algorithms use only simple techniques such as the parallel prefix
and suffix computations [1]. Therefore, they are easy to implement. The rest of
this paper is organized as follows. Section 2 establishes basic notations and some
interesting properties of trapezoid graphs. Section 3 gives a parallel algorithm
for the SSSP problem on trapezoid graphs. Section 4 shows how to preprocess a
given trapezoid graph in parallel such that any future shortest path query can be
answered efficiently. Section 5 presents a parallel algorithm for the MCSS problem
on trapezoid graphs. In Section 6, we discuss generalized trapezoid
graphs. Finally, we conclude our results in Section 7. Due to the page limitation,
our parallel algorithms for the generalized trapezoid graphs are omitted in this
extended abstract.

2 Basic Notations and Properties 

There are two parallel lines, called the top channel and the bottom channel. Each
channel is labeled with consecutive integer values 1, 2, 3, ..., from left to right.
A trapezoid $t_i$ is defined by four corner points $[a_i, b_i, c_i, d_i]$ such that $a_i$ and $b_i$
are on the top channel and $c_i$ and $d_i$ are on the bottom channel,
where $a_i < b_i$ and $c_i < d_i$. The above geometric representation is called the
trapezoid diagram $T$. Without loss of generality, we assume that all corner
points $a_i, b_i$, $i = 1, \ldots, n$, on the top channel (similarly, $c_i, d_i$ on the bottom
channel) are distinct with coordinates of consecutive integer values $1, 2, \ldots, 2n$.
Trapezoids are labeled in increasing order of their corner points $b_i$. A graph
$G = (V, E)$ is a trapezoid graph if it can be represented by a trapezoid diagram
$T$ such that each trapezoid corresponds to a vertex in $V$ and $(i,j) \in E$ if and
only if $t_i$ and $t_j$ intersect in the trapezoid diagram. Figure 1 shows a trapezoid
graph with its corresponding trapezoid diagram. In this paper, we assume that




Fig. 1. A trapezoid graph and its trapezoid diagram. 



the input trapezoid graph is connected. It is easy to see that our algorithms can 
be modified to handle the cases when the input graph is not connected. We also 
assume that its trapezoid diagram is given. 
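
The intersection test behind this definition can be sketched in a few lines. The snippet is a minimal illustration, assuming each trapezoid is stored as a tuple (a, b, c, d) and using the observation that two trapezoids between the channels are disjoint exactly when one lies entirely to the left of the other. The function names and the five-trapezoid diagram are hypothetical and are not taken from the paper or its Figure 1.

```python
def intersects(t, u):
    """Return True if trapezoids t and u, given as (a, b, c, d), intersect."""
    a1, b1, c1, d1 = t
    a2, b2, c2, d2 = u
    t_left_of_u = b1 < a2 and d1 < c2   # t ends before u starts on both channels
    u_left_of_t = b2 < a1 and d2 < c1
    return not (t_left_of_u or u_left_of_t)


def trapezoid_graph(traps):
    """Adjacency lists of the trapezoid graph, by pairwise tests (quadratic time)."""
    n = len(traps)
    return {i: [j for j in range(n) if j != i and intersects(traps[i], traps[j])]
            for i in range(n)}


# Hypothetical diagram: 5 trapezoids, corner points 1..10 on each channel,
# listed in increasing order of b (0-based indices stand for t_1, ..., t_5).
TRAPS = [(1, 3, 1, 2), (2, 4, 3, 8), (5, 6, 4, 5), (7, 9, 6, 7), (8, 10, 9, 10)]
graph = trapezoid_graph(TRAPS)
```

For this diagram the resulting graph is a tree with edges 0-1, 1-2, 1-3 and 3-4.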

Suppose $p$ is a corner point of trapezoid $t_i$. For ease of reference, let $A(p)$,
$B(p)$, $C(p)$ and $D(p)$ denote $a_i$, $b_i$, $c_i$ and $d_i$ respectively. If there is no
confusion, for trapezoid $u$, we also use $A(u)$, $B(u)$, $C(u)$ and $D(u)$
to denote its four corner points. For trapezoids $u$ and $v$, let $\mathrm{dis}(u, v)$ denote
the distance between $u$ and $v$ in the corresponding trapezoid graph. Consider
trapezoid $t_i$. Among trapezoids which intersect $t_i$, consider their corners. We
introduce notations to denote trapezoids with the rightmost and the leftmost
corner on the top channel and the bottom channel respectively. Formally, for
every trapezoid $t_i$, we define the left farthest-reaching trapezoid-pair, denoted by
$L(t_i) = (L_T(t_i), L_B(t_i))$, and the right farthest-reaching trapezoid-pair, denoted
by $R(t_i) = (R_T(t_i), R_B(t_i))$, as follows:

$$L_T(t_i) = t_j \text{ iff } a_j = \min\{a_l \mid \mathrm{dis}(t_i, t_l) \le 1\}$$

$$L_B(t_i) = t_j \text{ iff } c_j = \min\{c_l \mid \mathrm{dis}(t_i, t_l) \le 1\}$$

$$R_T(t_i) = t_j \text{ iff } b_j = \max\{b_l \mid \mathrm{dis}(t_i, t_l) \le 1\}$$

$$R_B(t_i) = t_j \text{ iff } d_j = \max\{d_l \mid \mathrm{dis}(t_i, t_l) \le 1\}.$$

For example, in Figure[T], Lriti) = t 2 , ^ 5 (^ 4 ) = t 2 , RriU) = t?, RB{ti) = is- 
Let $m_k$ and $w_k$ denote the $k$-th point on the top channel and the bottom channel
of a trapezoid diagram respectively. According to the following lemma, these
farthest-reaching trapezoid-pairs can be computed efficiently.

Lemma 1. In a trapezoid diagram with $n$ vertices, the following statements hold
for all $i$, $1 \le i \le n$:

$$L_T(t_i) = t_j \text{ iff } a_j = \min(\{A(m_k) \mid k = a_i, a_i + 1, \ldots, 2n\} \cup \{A(w_k) \mid k = c_i, c_i + 1, \ldots, 2n\})$$

$$L_B(t_i) = t_j \text{ iff } c_j = \min(\{C(m_k) \mid k = a_i, a_i + 1, \ldots, 2n\} \cup \{C(w_k) \mid k = c_i, c_i + 1, \ldots, 2n\})$$

$$R_T(t_i) = t_j \text{ iff } b_j = \max(\{B(m_k) \mid k = 1, 2, \ldots, b_i\} \cup \{B(w_k) \mid k = 1, 2, \ldots, d_i\})$$

$$R_B(t_i) = t_j \text{ iff } d_j = \max(\{D(m_k) \mid k = 1, 2, \ldots, b_i\} \cup \{D(w_k) \mid k = 1, 2, \ldots, d_i\})$$

□
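
As a rough illustration of how Lemma 1 turns the right farthest-reaching pairs into prefix computations, the sequential sketch below scans each channel once with a running maximum. It assumes trapezoids are tuples (a, b, c, d) with corner points 1..2n on each channel; the function name and the example diagram are hypothetical, and a parallel version would replace each scan by a parallel prefix computation.

```python
from itertools import accumulate


def right_pairs_prefix(traps):
    """Return lists (RT, RB) of 0-based indices of R_T(t_i) and R_B(t_i)."""
    n = len(traps)
    top_owner = [None] * (2 * n + 1)    # top_owner[k]: trapezoid owning point m_k
    bot_owner = [None] * (2 * n + 1)    # bot_owner[k]: trapezoid owning point w_k
    for i, (a, b, c, d) in enumerate(traps):
        top_owner[a] = top_owner[b] = i
        bot_owner[c] = bot_owner[d] = i

    def prefix_best(owners, corner):
        # best[k]: trapezoid with the largest given corner among owners[1..k]
        def pick(x, y):
            return x if traps[x][corner] >= traps[y][corner] else y
        return [None] + list(accumulate(owners[1:], pick))

    b_top, b_bot = prefix_best(top_owner, 1), prefix_best(bot_owner, 1)
    d_top, d_bot = prefix_best(top_owner, 3), prefix_best(bot_owner, 3)
    RT, RB = [], []
    for a, b, c, d in traps:
        RT.append(max(b_top[b], b_bot[d], key=lambda j: traps[j][1]))
        RB.append(max(d_top[b], d_bot[d], key=lambda j: traps[j][3]))
    return RT, RB


# Hypothetical diagram with 5 trapezoids (edges 0-1, 1-2, 1-3, 3-4).
TRAPS = [(1, 3, 1, 2), (2, 4, 3, 8), (5, 6, 4, 5), (7, 9, 6, 7), (8, 10, 9, 10)]
RT, RB = right_pairs_prefix(TRAPS)
```

The left pairs could be obtained symmetrically with suffix minima over the points from $a_i$ and $c_i$ onward.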

Now, we generalize the concept of $L(t_i)$ and $R(t_i)$. Consider trapezoids which
$t_i$ can reach within $k$ steps. Let $R_T^k(t_i)$ and $R_B^k(t_i)$ denote the trapezoid with
the rightmost corner on the top channel and the bottom channel respectively.
Formally, we define $L^k(t_i) = (L_T^k(t_i), L_B^k(t_i))$ and $R^k(t_i) = (R_T^k(t_i), R_B^k(t_i))$ as
follows:

$$L_T^k(t_i) = t_j \text{ iff } a_j = \min\{a_l \mid \mathrm{dis}(t_i, t_l) \le k\}$$

$$L_B^k(t_i) = t_j \text{ iff } c_j = \min\{c_l \mid \mathrm{dis}(t_i, t_l) \le k\}$$

$$R_T^k(t_i) = t_j \text{ iff } b_j = \max\{b_l \mid \mathrm{dis}(t_i, t_l) \le k\}$$

$$R_B^k(t_i) = t_j \text{ iff } d_j = \max\{d_l \mid \mathrm{dis}(t_i, t_l) \le k\}.$$

For example, consider Figure [TJ The set of trapezoids which ti can reach wit- 
hin two steps is {ti\i = 1,2,..., 7}. Therefore, R^{ti) = {tj,tf). By definition, 
$R^0(u) = (u,u)$, $L^1(u) = L(u)$ and $R^1(u) = R(u)$.

By definition, $B(R_T^k(u)) \ge B(R_B^k(u))$ and $D(R_B^k(u)) \ge D(R_T^k(u))$; it follows
that $R_T^k(u)$ intersects $R_B^k(u)$. We have the following lemma.

Lemma 2. For any trapezoid $u$, $R_T^k(u)$ intersects $R_B^k(u)$ and $L_T^k(u)$ intersects
$L_B^k(u)$. □

Now, we describe how to compute $\mathrm{dis}(u,v)$ for trapezoids $u$ and $v$. Without
loss of generality, assume that $D(u) < D(v)$. If $u$ intersects $v$, obviously,
$\mathrm{dis}(u,v) = 1$. Otherwise, if either $R_T^1(u)$ or $R_B^1(u)$ intersects $v$, $\mathrm{dis}(u,v) = 2$. If
neither $R_T^1(u)$ nor $R_B^1(u)$ intersects $v$, we try $R_T^2(u)$ and $R_B^2(u)$. In this way, we
will find some $k$ such that neither $R_T^{k-2}(u)$ nor $R_B^{k-2}(u)$ intersects $v$, yet $R_T^{k-1}(u)$
or $R_B^{k-1}(u)$ intersects $v$. It follows that $\mathrm{dis}(u,v) = k$.

Immediately, we have the following lemma.

Lemma 3. For any two trapezoids $u$ and $v$, suppose $D(u) < D(v)$ and $\mathrm{dis}(u,v) > 1$.
Then $\mathrm{dis}(u,v) = k$ iff

1. neither $R_T^{k-2}(u)$ nor $R_B^{k-2}(u)$ intersects $v$ and,

2. either $R_T^{k-1}(u)$ or $R_B^{k-1}(u)$ intersects $v$. □
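
The step-by-step walk suggested by Lemma 3 can be sketched as follows. This is only an illustration under stated assumptions: trapezoids are tuples (a, b, c, d), the farthest-reaching pairs are found by brute force, each step to the right takes the farther of the two candidates on each channel, and the function names and example diagram are hypothetical.

```python
def intersects(t, u):
    """True if trapezoids t and u, given as (a, b, c, d), intersect."""
    return not ((t[1] < u[0] and t[3] < u[2]) or (u[1] < t[0] and u[3] < t[2]))


def right_pairs(traps):
    """Brute-force R_T and R_B (0-based indices); each trapezoid counts itself."""
    n = len(traps)
    RT, RB = [], []
    for i in range(n):
        near = [j for j in range(n) if intersects(traps[i], traps[j])]
        RT.append(max(near, key=lambda j: traps[j][1]))
        RB.append(max(near, key=lambda j: traps[j][3]))
    return RT, RB


def distance(traps, u, v):
    """dis(u, v) for distinct u, v with D(u) < D(v) in a connected graph."""
    if intersects(traps[u], traps[v]):
        return 1
    RT, RB = right_pairs(traps)
    top = bot = u                       # the pair R^0(u) = (u, u)
    k = 1
    while True:
        # One step to the right: from R^{k-1}(u) to R^k(u).
        new_top = max(RT[top], RT[bot], key=lambda j: traps[j][1])
        new_bot = max(RB[top], RB[bot], key=lambda j: traps[j][3])
        if (new_top, new_bot) == (top, bot):
            raise ValueError("v cannot be reached by walking right from u")
        top, bot = new_top, new_bot
        k += 1
        if intersects(traps[top], traps[v]) or intersects(traps[bot], traps[v]):
            return k                    # first k with R^{k-1}(u) meeting v


# Hypothetical diagram with 5 trapezoids (edges 0-1, 1-2, 1-3, 3-4).
TRAPS = [(1, 3, 1, 2), (2, 4, 3, 8), (5, 6, 4, 5), (7, 9, 6, 7), (8, 10, 9, 10)]
```

The walk takes time proportional to the distance; the long-jump idea introduced later in the paper reduces the number of steps.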

Suppose that we already know $R^k(u) = (t_{j_1}, t_{j_2})$. How can we find $R^{k+1}(u)$?
We can find it from $R(t_{j_1})$ and $R(t_{j_2})$. This gives the following lemma.

Lemma 4. For any trapezoid $u$,

$$B(R_T^{k+1}(u)) = \max\{B(R_T(R_T^k(u))), B(R_T(R_B^k(u)))\}$$

$$D(R_B^{k+1}(u)) = \max\{D(R_B(R_T^k(u))), D(R_B(R_B^k(u)))\}$$ □

By induction and Lemma 4, we have the following theorem.

Theorem 1. For any trapezoid $u$,

$$B(R_T^{i+j}(u)) = \max\{B(R_T^j(R_T^i(u))), B(R_T^j(R_B^i(u)))\}$$

$$D(R_B^{i+j}(u)) = \max\{D(R_B^j(R_T^i(u))), D(R_B^j(R_B^i(u)))\}.$$ □

In other words, $R^{i+j}(u)$ can be computed from $R^j(R_T^i(u))$ and $R^j(R_B^i(u))$
within constant time. Instead of walking to the right only one step at a time, by
this theorem, we can walk $j$ steps at once. We call such an action one long-jump.
It will be used in the later sections.

The following lemma is also helpful for constructing a shortest path between 
two trapezoids. 

Lemma 5. For any two trapezoids u and v, if D{u) < D(v) and dis(u, r;) = k > 
1, then there exists a path (po,pi, . . . ,pk) such that po = u, pk = v, pi Pi-i, 
Pi = Rt{Pi-i) or Rb{Pi-i) /or z = 1,2, ... ,fc - 1. □ 

When we compute $\mathrm{dis}(u,v)$, $D(u) < D(v)$, we start walking from $u$ to $v$.
Suppose after $l$ steps, we still can not reach $v$. It means neither $R_T^q(u)$ nor $R_B^q(u)$
intersects $v$, for $q = 1, \ldots, l$. Now, we consider $\mathrm{dis}(R_T^l(u), v)$ and $\mathrm{dis}(R_B^l(u), v)$.
The following lemma shows that
$\mathrm{dis}(u,v) = l + \min\{\mathrm{dis}(R_T^l(u), v), \mathrm{dis}(R_B^l(u), v)\}$.

Lemma 6. For any two trapezoids $u$ and $v$, $D(u) < D(v)$, if $\mathrm{dis}(u,v) > l + 1$,
then $\mathrm{dis}(u,v) = l + \min\{\mathrm{dis}(R_T^l(u), v), \mathrm{dis}(R_B^l(u), v)\}$. □

3 Single-Source Shortest Path (SSSP) Problem

In this section, given a trapezoid diagram $T$ and a starting vertex $s$, we will
show how to solve the SSSP problem on trapezoid graphs in $O(\log n)$ time using
$O(n)$ processors on a CREW PRAM.

The union of all these shortest paths forms a shortest path tree rooted at $s$, that
is, a breadth first search (BFS) tree starting from $s$. In this paper, our algorithm
constructs a BFS tree to represent the solution of the SSSP problem. For ease
of discussion, we assume that $s = t_n$. It is easy to modify our algorithms to deal
with the case that $s \ne t_n$.

Let $G$ denote the corresponding trapezoid graph of $T$. First, we construct a
digraph $G'$ from $G$ such that $G'$ is a supergraph of a BFS tree of $G$ rooted at $s$.
We construct $G' = (V', E')$ as follows: let $V' = V$ and for $i = 1, \ldots, n - 1$, add
edges from $t_i$ to $R_T(t_i)$ and $R_B(t_i)$.

By definition of G' and Lemma[^ immediately we have the following lemma. 
Lemma 7 . G' = (V',E') has the following properties. 

1. G' is a directed acyclic graph with out-degree at most 2.

2. For any vertex u in G' , the shortest path from u to s on G' is also a shortest 

path from u to s on G . □ 

By the above lemma, we have the following theorem immediately. 

Theorem 2. G' is a supergraph of a BFS tree of G rooted at s. □ 

Now, we describe how to find a BFS tree. Since $s$ has the rightmost corner
point on the top channel, it is not difficult to see that for any $u$, $u \ne s$,
$\mathrm{dis}(u, s) = 1$ if and only if $R_T(u) = s$. We now consider the cases that $\mathrm{dis}(u, s)$ >
1. At j-th iteration, our algorithm performs one long-jump and walk 2* steps. 
Our algorithm uses the pointer jumping technique and spends 0(log n) parallel 
steps to calculate the distance of each trapezoid to s. We assign each trape- 
zoid a dedicated processor. Suppose that after (j — l)-th iteration, dis(u, s) is 
not decided yet. At the j-th iteration, the corresponding processor will check if 
dis^R"^ (u),s) or dis(i?^ (u), s) is found. If one of them is found, dis(u, s) will 
be decided by the equation in Theorem[l| Otherwise, Rf\u) will be computed. 
That is, in each parallel step, each processor either has decided the distance of 
the trapezoid in question or the processor doubles the distance of the farthest- 
reaching tuple for the trapezoid whose distance then will be decided for later 
step. 

In Figure 2, we show our parallel algorithm for the SSSP problem.

In the above algorithm, at the beginning, for every trapezoid $u$, $u \ne s$, we
compute $R(u)$. At the $i$-th iteration, either we decide $\mathrm{dis}(u, s)$ or we compute
$R^{2^i}(u)$. Note that at this moment $R^{2^{i-1}}(R_T^{2^{i-1}}(u))$ and $R^{2^{i-1}}(R_B^{2^{i-1}}(u))$ are
already computed at the previous iteration. In other words, if we can not decide
$\mathrm{dis}(u, s)$, we double the distance to probe $s$. In this way, to compute $\mathrm{dis}(u, s)$, it
needs to compute $R^{2^f}(u)$, for $f = 1, \ldots, \lfloor \log(\mathrm{dis}(u, s)) \rfloor$. We call these
trapezoid-pairs powered-distance trapezoid-pairs of $u$.

Algorithm SSSP

Input: A trapezoid diagram with $n$ trapezoids.

Output: The shortest path tree rooted at $s = t_n$.

Step 0. $F[i] = \infty$, for $i = 1, \ldots, n - 1$.

    *** Finally, $F[i]$ will store $\mathrm{dis}(t_i, s)$.

Step 1. Compute $R(t_i)$, for $i = 1, \ldots, n$.

    For each trapezoid $t_i$, $t_i \ne s$, if $R_T(t_i) = s$, then $F[i] = 1$.

Step 2. For $i = 1$ to $\lceil \log n \rceil$, for each trapezoid $t_j \ne s$ and $F[j] = \infty$
    if $F[R_T^{2^{i-1}}(t_j)] \ne \infty$ or $F[R_B^{2^{i-1}}(t_j)] \ne \infty$
        $F[j] = 2^{i-1} + \min\{F[R_T^{2^{i-1}}(t_j)], F[R_B^{2^{i-1}}(t_j)]\}$
    else
        compute $R_T^{2^i}(t_j)$ and $R_B^{2^i}(t_j)$ from
        $R^{2^{i-1}}(R_T^{2^{i-1}}(t_j))$ and $R^{2^{i-1}}(R_B^{2^{i-1}}(t_j))$
    end_if

Step 3. For each trapezoid $t_i$, compare $F[R_T(t_i)]$ and $F[R_B(t_i)]$.

    Choose $R_T(t_i)$ or $R_B(t_i)$ as its parent
    in the BFS tree such that its distance to $s$ is shorter.

End of SSSP



Fig. 2. A parallel algorithm for the single-source shortest path problem on trapezoid 
graphs. 



After Step 2 is performed, all distances to $s$ have been computed. By Theorem 2,
for every $u$, among $R_T(u)$ and $R_B(u)$, we can choose the one which is
nearer to $s$ as the parent of $u$ in the BFS tree. Therefore, Algorithm SSSP
computes the single-source shortest path tree correctly.
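
To make the doubling scheme concrete, here is a sequential simulation of the algorithm as read from Figure 2. It is a sketch under assumptions, not the authors' code: each parallel round is simulated by reading from a snapshot and writing to fresh arrays, the farthest-reaching pairs are found by brute force instead of prefix computations, trapezoids are tuples (a, b, c, d) listed in increasing order of b (so that s is the last one), and all names and the example diagram are hypothetical.

```python
import math


def intersects(t, u):
    """True if trapezoids t and u, given as (a, b, c, d), intersect."""
    return not ((t[1] < u[0] and t[3] < u[2]) or (u[1] < t[0] and u[3] < t[2]))


def right_pairs(traps):
    """Brute-force R_T and R_B (0-based indices); each trapezoid counts itself."""
    n = len(traps)
    RT, RB = [], []
    for i in range(n):
        near = [j for j in range(n) if intersects(traps[i], traps[j])]
        RT.append(max(near, key=lambda j: traps[j][1]))
        RB.append(max(near, key=lambda j: traps[j][3]))
    return RT, RB


def sssp(traps):
    """Return (F, parent): distances to s = last trapezoid and BFS-tree parents."""
    n = len(traps)
    s = n - 1
    RT, RB = right_pairs(traps)
    F = [math.inf] * n                  # Step 0
    F[s] = 0
    for u in range(n - 1):              # Step 1
        if RT[u] == s:
            F[u] = 1
    top, bot = RT[:], RB[:]             # current pairs R^{step}(u), step = 2^{i-1}
    step = 1
    for _ in range((n - 1).bit_length()):   # Step 2: ceil(log2 n) rounds
        new_F, new_top, new_bot = F[:], top[:], bot[:]
        for u in range(n - 1):          # one "processor" per trapezoid
            if F[u] != math.inf:
                continue
            x, y = top[u], bot[u]
            if F[x] != math.inf or F[y] != math.inf:
                new_F[u] = step + min(F[x], F[y])
            else:                       # long-jump: double the probing distance
                new_top[u] = max(top[x], top[y], key=lambda j: traps[j][1])
                new_bot[u] = max(bot[x], bot[y], key=lambda j: traps[j][3])
        F, top, bot = new_F, new_top, new_bot
        step *= 2
    parent = [None] * n                 # Step 3
    for u in range(n - 1):
        parent[u] = min(RT[u], RB[u], key=lambda j: F[j])
    return F, parent


# Hypothetical diagram with 5 trapezoids (edges 0-1, 1-2, 1-3, 3-4).
TRAPS = [(1, 3, 1, 2), (2, 4, 3, 8), (5, 6, 4, 5), (7, 9, 6, 7), (8, 10, 9, 10)]
F, parent = sssp(TRAPS)
```

Reading from a snapshot matters: if a round were updated in place, a trapezoid could combine a freshly decided distance with a still-undecided one and take the wrong minimum.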

Consider the complexity of the above algorithm now. 

Step 1. By Lemma 1, all right farthest-reaching trapezoid-pairs can be computed
by utilizing the parallel prefix or suffix computations [1] in $O(\log n)$ time
using $O(n/\log n)$ processors.

Step 2. It is not difficult to see that this step takes O(logn) time using 0(n) 
processors. Note that at the i-th iteration, for trapezoid u, R^ (u) will not be 
used anymore. It follows we can use 0(n) space to store information we need. 

Step 3. It is easy to perform this step in constant time using 0(n) processors. 

Therefore, we have the following theorem. 

Theorem 3. The single-source shortest path problem on trapezoid graphs can 
be solved in O(logn) time using 0(n) space and processors under the CREW 
PRAM model. □ 



4 All-Pair Shortest Path Query (APSPQ) Problem 

In this section, we will present our parallel algorithm for the APSPQ problem 
on trapezoid graphs. 

Consider the preprocessing stage now. We modify Algorithm SSSP to get the
preprocessing algorithm. First, we construct the BFS tree rooted at $t_n$. Instead
of only keeping a constant number of powered-distance trapezoid-pairs for each
trapezoid, we keep all powered-distance trapezoid-pairs. It takes $O(\log n)$ space
to store this information for one trapezoid. So, it costs $O(n \log n)$ space for all
trapezoids.

For any trapezoid $u$ in $T$, let $\mathrm{lev}(u)$ denote the level of $u$ in the shortest path
tree rooted at $t_n$. Note that $\mathrm{lev}(u)$ is equal to the distance between $u$ and $t_n$.
For the query stage, suppose the queried trapezoids are $u$ and $v$. Without loss
of generality, we assume that $\mathrm{lev}(u) \ge \mathrm{lev}(v)$.

The following lemma and theorem show important properties that we use to
compute $\mathrm{dis}(u, v)$.

Lemma 8. For trapezoids $u$ and $v$, $u \ne v$, if $\mathrm{lev}(u) = \mathrm{lev}(v)$, then $\mathrm{dis}(u, v) \le 2$. □

Theorem 4. For trapezoids $u$ and $v$, if $\mathrm{lev}(u) - \mathrm{lev}(v) = k > 0$, then
$k \le \mathrm{dis}(u, v) \le k + 2$. □

By Theorem 4, there are only three candidates for $\mathrm{dis}(u, v)$: $k$, $k + 1$ and $k + 2$.
At the query phase, we start from $k$. Consider $R^{k-1}(u)$. If one of its trapezoids
intersects $v$, by Lemma 3, $\mathrm{dis}(u, v) = k$. Otherwise, we try $k + 1$ and then $k + 2$.

We compute $R^{k-1}(u)$ using powered-distance trapezoid-pairs. Consider the
binary representation of $k - 1$. It needs only $\lceil \log_2(k - 1) \rceil$ bits to represent
$k - 1$. Therefore, following the powered-distance trapezoid-pairs, we walk from
$u$ toward $v$ and within $\lceil \log_2(k - 1) \rceil$ long-jumps, we can get $R^{k-1}(u)$.
For example, suppose $k - 1 = 656$. Since $656 = 2^9 + 2^7 + 2^4$, we can compute
$R^{656}(u)$ from powered-distance trapezoid-pairs $R^{2^9}(u)$, $R^{2^7}(u)$ and $R^{2^4}(u)$.
By Theorem 1, it takes only constant time to perform one long-jump. Since
all powered-distance trapezoid-pairs are already computed and stored during the
preprocessing stage, we have the following theorem. Let $\delta$ denote the distance
between two queried vertices in the rest of this paper.

Theorem 5. The preprocessing stage for the all-pair shortest path query problem
on trapezoid graphs can be solved in $O(\log n)$ time and $O(n \log n)$ space
using $O(n)$ processors under the CREW PRAM model. Further, a query can be
answered in $O(\log \delta)$ time using one processor. □
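
The use of stored powered-distance pairs in a query can be sketched as below. This is an illustration only: the tables are built sequentially and by brute force, trapezoids are tuples (a, b, c, d), and the names `compose`, `powered_pairs`, `long_jump` and `powers_of_two` as well as the example diagram are hypothetical. Each call to `compose` is one long-jump in the sense of Theorem 1.

```python
def intersects(t, u):
    """True if trapezoids t and u, given as (a, b, c, d), intersect."""
    return not ((t[1] < u[0] and t[3] < u[2]) or (u[1] < t[0] and u[3] < t[2]))


def right_pairs(traps):
    """Brute-force (R_T, R_B) pair of every trapezoid (0-based indices)."""
    n = len(traps)
    pairs = []
    for i in range(n):
        near = [j for j in range(n) if intersects(traps[i], traps[j])]
        pairs.append((max(near, key=lambda j: traps[j][1]),
                      max(near, key=lambda j: traps[j][3])))
    return pairs


def compose(traps, pair, table):
    """Extend the pair R^i(u) by the jump R^j stored in table, giving R^{i+j}(u)."""
    x, y = pair
    top = max(table[x][0], table[y][0], key=lambda j: traps[j][1])
    bot = max(table[x][1], table[y][1], key=lambda j: traps[j][3])
    return top, bot


def powered_pairs(traps):
    """tables[f][u] is the pair R^{2^f}(u), for every f with 2^f < n."""
    n = len(traps)
    tables = [right_pairs(traps)]
    while 2 ** len(tables) < n:
        prev = tables[-1]
        tables.append([compose(traps, prev[u], prev) for u in range(n)])
    return tables


def powers_of_two(m):
    """Exponents of the powers of two in the binary representation of m."""
    return [f for f in range(m.bit_length()) if (m >> f) & 1]


def long_jump(traps, tables, u, m):
    """Compute R^m(u) with one long-jump per 1-bit of m (m < 2^len(tables))."""
    pair = (u, u)                       # R^0(u)
    for f in powers_of_two(m):
        pair = compose(traps, pair, tables[f])
    return pair


# Hypothetical diagram with 5 trapezoids (edges 0-1, 1-2, 1-3, 3-4).
TRAPS = [(1, 3, 1, 2), (2, 4, 3, 8), (5, 6, 4, 5), (7, 9, 6, 7), (8, 10, 9, 10)]
TABLES = powered_pairs(TRAPS)
```

For instance, `powers_of_two(656)` returns the exponents 4, 7 and 9, matching the decomposition used in the example in the text.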

5 Minimum Cardinality Steiner Set (MCSS) Problem 

The MCSS problem aims to find the minimum number of vertices to connect a 
set of target vertices. Liang [S] presented a linear time sequential algorithm for 
the MCSS problem on trapezoid graphs. His algorithm consists of the following
three phases. Readers who are interested in the details should refer to the original
paper.

Phase 1. Find the connected components among target trapezoids. Treat
each component as a new target trapezoid. Let T' = ■ ■ ■ , tj} be the set of 

new target vertices with B{t{) < ^(^ 2 ) . . • < 

Phase 2. For each non-target trapezoid, find the largest indexed new trapezoid
that intersects it. Merge this new target vertex into it.

Phase 3. The new trapezoid graph G consists of new target trapezoids and 
new non-target trapezoids. Find a shortest path from to on G. 

Now, we describe how to perform the three phases in parallel. 

Phase 1. Consider the graph induced by target trapezoids. We observe that 
any two target trapezoids u and v are in the same connected component if and 
only if R^{u) = K^{v). We can easily modify Algorithm SSSP to compute these 
trapezoid-pairs in O(logn) time using 0(n) processors. 

Phase 2. Recall Lemma 1. Similar to the computation of the farthest-reaching
trapezoid-pairs, for each non-target trapezoid, we can find the largest indexed new
trapezoid intersecting it using only parallel prefix and suffix computations [1].
This phase can be done in $O(\log n)$ time using $O(n/\log n)$ processors.

Phase 3. Using Algorithm SSSP, we find the shortest path tree rooted at 
on G. It is easy to report the shortest path from ti to on this tree. It follows 
this phase can be done in $O(\log n)$ time using $O(n)$ processors.

Therefore, we have the following theorem. 

Theorem 6. The minimum cardinality Steiner set problem on trapezoid graphs 
can be solved in O(logn) time using 0(n) space and processors under the CREW 
PRAM model. □ 



6 Shortest Paths on Generalized Trapezoid Graphs 



Along with the direction that generalizes interval graphs and permutation gra- 
phs to trapezoid graphs, researchers are now trying to generalize the class of 
trapezoid graphs. 

Flotow introduces the class of d-trapezoid graphs that are the intersection 
graphs of d-trapezoids, where a d-trapezoid is defined by d intervals on d parallel 
lines. Note that the 1-trapezoid graph is exactly the class of interval graphs and 
the 2-trapezoid graph is exactly the class of trapezoid graphs. Kratsch |Z] defines 
circular trapezoid graphs as the intersection graphs of circular trapezoids. Here a
circular trapezoid is a generalized trapezoid between two concentric circles; two
parallel lines of the circular trapezoid are circular arcs of each of the two circles,
and the two other lines of the circular trapezoid are spiral segments. They also
extend circular trapezoid graphs to $d > 2$ concentric circles; the generalized
class of graphs is called circular d-trapezoid graphs. In our full paper, we also
show that the previous discussions about shortest paths and related problems 
can be extended to some generalized versions of trapezoid graphs including d- 
trapezoid graphs and circular d-trapezoid graphs. Due to page limitation, these 
algorithms are omitted in the extended abstract. 

7 Conclusion

In this paper, we consider parallel algorithms for shortest paths and related
problems on trapezoid graphs under the CREW PRAM model. Given a trapezoid
graph with $n$ vertices, we present parallel algorithms solving the following
problems: For the single-source shortest path problem, the algorithm runs in
$O(\log n)$ time using $O(n)$ processors and space. For the all-pair shortest path
query problem, after spending $O(\log n)$ preprocessing time using $O(n \log n)$ space
and $O(n)$ processors, the algorithm can answer the query in $O(\log \delta)$ time using
one processor. Here $\delta$ denotes the distance between two queried vertices. For the
minimum cardinality Steiner set problem, the algorithm runs in $O(\log n)$ time
using $O(n)$ processors and space.

We also extend our results to d-trapezoid graphs and circular d-trapezoid
graphs. The class of (circular) trapezoid graphs is exactly the class of (circular)
2-trapezoid graphs. We have the following results. The single-source shortest
path problem and the minimum cardinality Steiner set problem on d-trapezoid
graphs and circular d-trapezoid graphs can both be solved in $O(\log n \log d)$ time
using $O(nd)$ space and $O(d^2 n / \log d)$ processors. The all-pair shortest path query
problem on d-trapezoid graphs and circular d-trapezoid graphs can be answered
in $O(d \log \delta)$ time using one processor after spending $O(\log n \log d)$ preprocessing
time using $O(nd \log n)$ space and $O(d^2 n / \log d)$ processors. It would be
interesting to design parallel algorithms using fewer processors for these problems.

Abstract. Clustering and classification problems arise in a wide range
of application settings, from clustering documents and placing centers in
networks to image processing, biometric analysis, language modeling and
the categorization of hypertext documents.

The applications mentioned above give rise to a number of related
algorithmic problems, each of which is NP-complete. Approximation
algorithms provide a framework to develop algorithms for such problems
that have provable performance guarantees. In this talk we shall survey
some of the general techniques and recent developments in approximation
algorithms for these problems.

Abstract. How many people can hide in a given terrain, without any
two of them seeing each other? We are interested in finding the precise
number and an optimal placement of people to be hidden, given a terrain
with $n$ vertices. In this paper, we show that this is not at all easy: The
problem of placing a maximum number of hiding people is almost as
hard to approximate as the Maximum Clique problem, i.e., it cannot be
approximated by any polynomial-time algorithm with an approximation
ratio of $n^{\epsilon}$ for some $\epsilon > 0$, unless P = NP. This is already true for a
simple polygon with holes (instead of a terrain). If we do not allow holes
in the polygon, we show that there is a constant $\epsilon > 0$ such that the
problem cannot be approximated with an approximation ratio of $1 + \epsilon$.



1 Introduction and Problem Definition 

While many of the traditional art gallery problems such as Vertex Guard 
and Point Guard deal with the problem of guarding a given polygon with 
a minimum number of guards, the problem of hiding a maximum number of 
objects from each other in a given polygon is intellectually appealing as well. 
When we let the problem instance be a terrain rather than a polygon, we obtain 
the following background, which is the practical motivation for the theoretical 
study of our problem: A real estate agency owns a large, uninhabited piece of 
land in a beautiful area. The agency plans to sell the land in individual pieces 
to people who would like to have a cabin in the wilderness, which to them 
means that they do not see any signs of human civilization from their cabins. 
Specifically, they do not want to see any other cabins. The real estate agency, in 
order to maximize profit, wants to sell as many pieces of land as possible. 

In an abstract version of the problem we are given a terrain which represents 
the uninhabited piece of land that the real estate agency owns. A terrain T is a 
two-dimensional surface in three-dimensional space, represented as a finite set of 
vertices in the plane, together with a triangulation of their planar convex hull, 
and a height value associated with each vertex. By a linear interpolation in
between the vertices, this representation defines a bivariate continuous function.
The corresponding surface in space is also called a 2.5-dimensional terrain. A 
terrain divides three-dimensional space into two subspaces, i.e. a space above and 
a space below the terrain, in the obvious way. In the literature, a terrain is also 
called a triangulated irregular network (TIN), see |S]. The problem now consists 
of finding a maximum number of lots (of comparatively small size) in the terrain,
upon which three-dimensional bounding boxes can be positioned that represent 
the cabins such that no two points of two different bounding boxes see each 
other. Two points see each other, if the straight line segment connecting the two 
points does not intersect the space below the terrain. Since the bounding boxes 
that represent the cabins are small compared to the overall size and elevation 
changes in the terrain (assume that we have a mountainous terrain), we may 
consider these bounding boxes to be zero-dimensional, i.e. to be points on the 
terrain. This problem has other potential applications in animated computer
games, where a player needs to find and collect or destroy as many objects as
possible. Not seeing the next object while collecting an object makes the game 
more interesting. We are now ready to formally define the first problem that we 
study: 

Definition 1. The problem Maximum Hidden Set on Terrain asks for a 
set S of maximum cardinality of points on a given terrain T, such that no two 
points in S see each other. 

In a variant of the problem, we introduce the additional restriction that these 
points on the terrain must be vertices of the terrain. 

Definition 2. The problem Maximum Hidden Vertex Set on Terrain asks 
for a set S of maximum cardinality of vertices of a given terrain T, such that 
no two vertices in S see each other. 

In a more abstract variant of the same problem, we are given a simple polygon 
with or without holes instead of a terrain. A simple polygon with holes in the 
plane is given by its ordered sequence of vertices on the outer boundary, together 
with an ordered sequence of vertices for each hole. A simple polygon without 
holes in the plane is simply given by its ordered sequence of vertices on the outer 
boundary. Again, we can impose the additional restriction that the points to be 
hidden from each other must be vertices of the polygon. This yields the following 
four problems. 

Definition 3. The problem Maximum Hidden Set on Polygon with(out) 
Holes asks for a set S of maximum cardinality of points in the interior or on 
the boundary of a given polygon P, such that no two points in S see each other. 

Definition 4. The problem Maximum Hidden Vertex Set on Polygon 
with(out) Holes asks for a set S of maximum cardinality of vertices of a 
given polygon P, such that no two vertices in S see each other. 
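
The vertex-restricted definitions can be read as an independent-set question on the visibility relation: a hidden vertex set is a set of vertices no two of which are mutually visible. The following minimal sketch takes the visibility predicate as given and searches exhaustively. The function names and the toy relation `VISIBLE` are illustrative assumptions, not part of the paper, and the search is exponential, so it is only meant for very small inputs.

```python
from itertools import combinations


def is_hidden_set(candidates, sees):
    """Return True if no two distinct elements of `candidates` see each other."""
    return not any(sees(p, q) for p, q in combinations(candidates, 2))


def maximum_hidden_vertex_set(vertices, sees):
    """Exhaustive search over subsets, largest first (exponential time)."""
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if is_hidden_set(subset, sees):
                return list(subset)
    return []


# Hypothetical visibility relation on four vertices; it is NOT derived from
# real geometry and only serves to exercise the functions above.
VISIBLE = {frozenset(pair) for pair in [(0, 1), (1, 2), (2, 3), (0, 2)]}


def toy_sees(p, q):
    """Symmetric visibility lookup in the toy relation."""
    return frozenset((p, q)) in VISIBLE
```

For an actual polygon or terrain, `sees` would have to be replaced by a geometric test that implements the segment condition of the corresponding definition.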

Two points in the polygon see each other, if the straight line segment connecting 
the two points does not intersect the exterior (and the holes) of the polygon. 
In this paper, we propose a reduction from Maximum Clique to Maximum 
Hidden Set on Polygon with Holes. The same reduction with minor
modifications will also work for Maximum Hidden Set on Terrain, Maximum
Hidden Vertex Set on Polygon with Holes, and Maximum Hidden 
Vertex Set on Terrain. Maximum Clique cannot be approximated by a
polynomial-time algorithm with a ratio of $n^{1-\epsilon}$ unless coR = NP and with a
ratio of $n^{1/2-\epsilon}$ unless NP = P for any $\epsilon > 0$, where n is the number of vertices
in the graph [7]. We will show that our reduction is gap-preserving (a technique
proposed in E), and thus show inapproximability results for all four problems. 
Maximum Clique consists of finding a maximum complete subgraph of a given 
graph G, as usual. 
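
To make the source problem of the reduction concrete, here is a small brute-force sketch of Maximum Clique. It is not an algorithm from the paper; the function name and the edge-list input format are assumptions chosen for illustration, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def maximum_clique(num_vertices, edges):
    """Return one maximum clique of the graph on vertices 0..num_vertices-1."""
    edge_set = {frozenset(e) for e in edges}
    # Try subset sizes from largest to smallest; the first hit is optimal.
    for size in range(num_vertices, 0, -1):
        for subset in combinations(range(num_vertices), size):
            if all(frozenset(p) in edge_set for p in combinations(subset, 2)):
                return list(subset)
    return []
```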

We also propose a reduction from Maximum 5-Occurrence-2-Satisfiability
to Maximum Hidden Set on Polygon without Holes, which will also
work for Maximum Hidden Vertex Set on Polygon without Holes.
Maximum 5-Occurrence-2-Satisfiability is APX-hard, which is equivalent to
saying that there exists a constant $\epsilon > 0$ such that no polynomial-time algorithm
can achieve an approximation ratio of $1 + \epsilon$ for Maximum 5-Occurrence-2-
Satisfiability (unless P = NP). See [3] for an introduction to the class APX and
for the relationship between the two classes APX and MaxSNP; see El for the
MaxSNP-hardness proof for Maximum 5-Occurrence-2-Satisfiability. Please note
that MaxSNP-hardness implies APX-hardness [^. We show that our reduction
is gap-preserving and thus establish the APX-hardness of Maximum Hidden
(Vertex) Set on Polygon without Holes. Maximum 5-Occurrence-
2-Satisfiability consists of finding a truth assignment for the variables of a
given boolean formula. The formula consists of disjunctive clauses with at most
two literals and each variable appears in at most 5 literals. The truth assignment
must satisfy a maximum number of clauses.
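
The problem statement above can be illustrated with a short sketch. The encoding is an assumption made only for this example: variables are numbered 1..n, a positive integer k stands for the literal $x_k$ and a negative integer -k for its negation. The brute-force search is exponential and is not part of the paper.

```python
from collections import Counter
from itertools import product


def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal under `assignment`."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def within_occurrence_limit(clauses, limit=5):
    """Check that every variable appears in at most `limit` literals."""
    occurrences = Counter(abs(lit) for clause in clauses for lit in clause)
    return all(count <= limit for count in occurrences.values())


def max_2sat_brute_force(num_vars, clauses):
    """Maximum number of simultaneously satisfiable clauses (exhaustive)."""
    best = 0
    for values in product([False, True], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), values))
        best = max(best, count_satisfied(clauses, assignment))
    return best
```

For instance, the four clauses over two variables that contain every sign pattern can never be satisfied all at once, so the optimum is three.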

There are various problems that deal with terrains. Quite often, these
problems have applications in the field of telecommunications, namely in setting up
communications networks. There are some upper and lower bound results on the
number of guards needed for several kinds of guards to collectively cover all of
a given terrain [2]. Very few results on the computational complexity of terrain
problems are known. The shortest watchtower (from where a terrain can be seen
in its entirety) can be computed in time $O(n \log n)$ [15]. The problem of finding a
minimum number of vertices of a terrain such that guards at these vertices see all
of the terrain is NP-hard and cannot be approximated with an approximation
ratio that is better than logarithmic in the number of vertices of the terrain.
Similar results hold for the variation, where guards may only be placed at a
certain given height above the terrain [^. When we deal with polygons rather
than terrains, we speak of art gallery or visibility problems. Many results (upper
and lower bounds, as well as computational complexity results) are known for
visibility problems. See [10,13,14] for an overview, as well as more recent work
on the inapproximability of Vertex/Edge/Point Guard on polygons with
[1] and without holes |S].

The problems Maximum Hidden Set on a Polygon without Holes
and Maximum Hidden Vertex Set on a Polygon without Holes are
known to be NP-hard m. This immediately implies the NP-hardness of the
corresponding problems for polygons with holes. A quite simple reduction from
these polygon problems to the terrain problems (as given in Sect. 3) even implies
the NP-hardness for the two terrain problems as well. In this paper, we give the
first inapproximability results for these problems. Our results suggest that these
problems differ significantly in their approximation properties.

Fig. 1. Example graph and the polygon constructed from it

This paper is organized as follows. In Sect. 2 we propose a reduction from
Maximum Clique to Maximum Hidden Set on Polygon with Holes.
We show that our reduction is gap-preserving and obtain our inapproximability
results for Maximum Hidden (Vertex) Set on a Polygon with Holes.
We show that our proofs also work for Maximum Hidden (Vertex) Set on
Terrain with minor modifications in Sect. 3. In Sect. 4 we show the APX-
hardness of Maximum Hidden (Vertex) Set on Polygon without Holes.
Finally, we draw some conclusions in Sect. 5.

2 Inapproximability Results for the Problems for 
Polygons with Holes 

Suppose we are given an instance I of Maximum Clique, i.e. an undirected
graph G = (V, E), where $V = \{v_0, \ldots, v_{n-1}\}$. Let m := |E|. We construct an
instance I' of Maximum Hidden Set on Polygon with Holes as follows.
I' consists of a polygon with holes. The polygon is basically a regular 2n-gon
with holes, but we replace every other vertex by a comb-like structure. Each
hole is a small triangle designed to block the view of two combs from each other,
whenever the two vertices, to which the combs correspond, are connected by an
edge in the graph. Figure 1 shows an example of a graph and the corresponding
polygon with holes. (Note that only the solid lines are lines of the polygon and
also note that the combs are not shown in Fig. 1.)

Let the regular 2n-gon consist of vertices $v_0, v'_0, \ldots, v_{n-1}, v'_{n-1}$ in
counter-clockwise order, to indicate that we map each vertex $v_i \in V$ in the graph to
a vertex $v_i$ in the polygon. We need some notation first. Let $e_{i,j}$ denote the
intersection point of the line segment from $v'_{i-1}$ to $v'_i$ with the line segment from
$v_i$ to $v_j$, as indicated in Fig. 2. (Note that we make liberal use of the index
notation for the vertices, i.e. $v_{i+1}$ is strictly speaking $v_{(i+1) \bmod n}$, accordingly for
$v_{i-1}$.) Let d denote the minimum of the distances of $e_{i,j}$ from $e_{i,j+1}$, where the
minimum is taken over all i,j = 1, . . . ,n. Let e~^ i^tj) denote the point at di- 
stance I from Cij on the line from v'^_i to u' that is closer to v'^_i (u'). Let rrii 
be the midpoint of the line segment from vertex Vi to and let m' be the 
intersection point of the line from u' to rrii and from to e~^i ^ (see Fig. |2]). 

Finally, let efj denote the intersection point of the line from e~j to m' and the 
line from efj to The detailed construction of these points is shown in Fig. 

m We let the triangle formed by the three vertices efj, e“^, and efj be a hole 
in the polygon iff there exists an edge in G from $v_i$ to $v_j$. Recall Fig. 1, which
gives an example.

We now refine the polygon obtained so far by cutting off a small portion at 
each vertex Vi. For each i G {0, . . . ,n}, we introduce two new vertices Vi^ and 
as indicated in Fig. El Vertex Vi^ is defined as the intersection point of 
the line that is parallel to the line from Vi-\ to Vi and goes through point 
and of the line from Vi to u'. Symmetrically, vertex is defined as the 

intersection point of the line that is parallel to the line from Ui+i to Vi and that 
goes through point and of the line from Vi to We fix — 1 additional 

vertices . . . , on the line segment from Vi^ to for each i as shown 

in Fig. El For a fixed i, the two vertices Vij and Vij+i have equal distance for 
alH G {0, . . . , n^}. Finally, we fix additional vertices Wi^i for / G {0, . . . , n^} 
for each i. Vertex Wi^i is defined as the intersection point of the line from vertex 
v'i_i through Vi^i with the line from vertex vertex u' through Vi^i+i. The polygon 
between two vertices vl_^ and u' is now given by the following ordered sequence 
Fig. 3. Construction of the comb of $v_i$



of vertices: Wi^, u' as indicated in Fig. 

El We call the set of all triangles Vi^i,Wi^i,Vi^i+i for a fixed i and all / £ {0, , n^} 
the comb of Vi. We have the following property of the construction. 

Lemma 1. In any feasible solution S' of the Maximum Hidden Set on
Polygon with Holes instance I', at most 2n points in S' can be placed outside
the combs.

Proof In each of the n trapezoids {Ui_i, n', Uj_„2_|_i, (see Figs. [l]and|2]), 
there can be at most one point, which gives n points in total. Moreover, by our 
construction any point p in the trapezoid {u'_ 2 , u', m', (not in the holes) 

can see every point p' in the n-gon {ug, . . . ,v'„} except for points p' in any of 
the holes and (possibly) except for points p' in the triangles 
and {t(, m', ;} (see Fig. [21). Therefore, all points in S' that lie in the n-gon 

■ Wn} must lie in only one of the n polygons m'_i, m', ,, u', 

Obviously, at most n points can be hidden in any one of these polygons. □ 

We have the following observation, which follows directly from the construc- 
tion: 

Observation 1. Any point in the comb of $v_i$ completely sees the comb of vertex
$v_j$ if $(v_i, v_j)$ is not an edge in the graph. If $(v_i, v_j)$ is an edge in the graph, then
no point in the comb of $v_i$ sees any point in the comb of $v_j$.

Given a feasible solution S' of the Maximum Hidden Set on Polygon with
Holes instance I', we obtain a feasible solution S of the Maximum Clique
instance I as follows: A vertex $v_i \in V$ is in the solution S iff at least one point
from S' lies in the comb of $v_i$. To see that S is a feasible solution, assume by
contradiction that it is not. Then there exists a pair of vertices
$v_i, v_j \in S$ with no edge between them. But then, by construction, there is no hole
in the polygon to block the view between the comb of $v_i$ and the comb of $v_j$,
so by Observation 1 the points of S' in these two combs see each other,
contradicting the feasibility of S'.
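
The transformation from S' to S only needs to know which comb, if any, contains each point of S'. The following sketch assumes that this information is already available as a list (an input format chosen for illustration, not taken from the paper) and checks the clique property on the result.

```python
from itertools import combinations


def combs_to_vertex_set(comb_of_point):
    """Map a hidden set to graph vertices.

    `comb_of_point[t]` is the index i of the comb containing the t-th point
    of S', or None if that point lies outside all combs.
    """
    return {i for i in comb_of_point if i is not None}


def is_clique(vertex_set, edges):
    """Check that every pair of vertices in `vertex_set` is joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(
        frozenset((u, v)) in edge_set
        for u, v in combinations(sorted(vertex_set), 2)
    )
```

If the input really stems from a feasible hidden set of the constructed polygon, the argument above implies that `is_clique` returns True for the resulting vertex set.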

We need to show that the construction of I' can be done in polynomial
time and that a feasible solution can be transformed in polynomial time. There
are $2n^2 + 1$ vertices in each of the n combs. We have additional n vertices $v'_i$.
There are 2 holes for each edge in the graph and each hole consists of 3 vertices.
Therefore, the polygon P consists of $6m + 2n^3 + 2n$ vertices. It is known in
computational geometry that the coordinates of intersection points of lines with
rational coefficients can be expressed with polynomial length. All of the points
in our construction are of this type. Therefore, the construction is polynomial.
The transformation of a feasible solution can obviously be done in polynomial
time.
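
The vertex count can be checked mechanically by adding up the three contributions named above. This is only an arithmetic sketch; the function name is an assumption.

```python
def polygon_vertex_count(n, m):
    """Vertices of the constructed polygon for a graph with n vertices, m edges."""
    comb_vertices = n * (2 * n**2 + 1)   # 2n^2 + 1 vertices in each of n combs
    primed_vertices = n                  # the additional vertices v'_i
    hole_vertices = 2 * m * 3            # 2 triangular holes per edge
    return comb_vertices + primed_vertices + hole_vertices
```

The total agrees with $6m + 2n^3 + 2n$, and since $m \le n(n-1)/2$ it stays below $10n^3$ for every $n \ge 1$.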

We obtain our inapproximability result by using the technique of gap-preserv- 
ing reductions (as introduced in m), which consists of transforming a promise 
problem into another promise problem. 

Lemma 2. Let OPT denote the size of an optimum solution of the Maximum
Clique instance I, let OPT' denote the size of an optimum solution of the
Maximum Hidden Set on Polygon with Holes instance I', and let $k \le n$.
The following holds: $OPT \ge k \Rightarrow OPT' \ge n^2 k$.

Proof. If $OPT \ge k$, then there exists a clique in I of size k. We obtain a solution
for I' of size $n^2 k$ by simply letting the vertices $w_{i,l}$ for $l \in \{0, \ldots, n^2\}$ be
in the solution if and only if vertex $v_i$ is in the clique. The solution thus
obtained for I' is feasible (see Observation 1). □

Lemma 3. Let OPT denote the size of an optimum solution of the Maximum
Clique instance I, let OPT' denote the size of an optimum solution of the
Maximum Hidden Set on Polygon with Holes instance I', let $k \le n$, and
let $\epsilon > 0$. The following holds: $OPT < \frac{k}{n^{1/2-\epsilon}} \Rightarrow OPT' < \frac{n^2 k}{n^{1/2-\epsilon}} + 2n$.

Proof. We prove the contraposition: $OPT' \ge \frac{n^2 k}{n^{1/2-\epsilon}} + 2n \Rightarrow OPT \ge \frac{k}{n^{1/2-\epsilon}}$.

Suppose we have a solution of I' with $\frac{n^2 k}{n^{1/2-\epsilon}} + 2n$ points. At most 2n of the points
in the solution can be outside the combs, because of Lemma 1. Therefore, at least
$\frac{n^2 k}{n^{1/2-\epsilon}}$ points must be in the combs. From the construction of the combs, it is clear
that at most $n^2$ points can hide in each comb. Therefore, the number of combs
that contain at least one point from the solution is at least
$\frac{n^2 k / n^{1/2-\epsilon}}{n^2} = \frac{k}{n^{1/2-\epsilon}}$.
The transformation of a solution as described above yields a solution of I with
at least $\frac{k}{n^{1/2-\epsilon}}$ vertices. □
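
The counting step in the proof is a pigeonhole argument, sketched below in integer arithmetic. The per-comb capacity is passed in as a parameter; taking it to be $n^2$ is an assumption read off the comb construction, and the function name is illustrative.

```python
def min_combs_hit(solution_size, n, comb_capacity):
    """Lower bound on the number of combs containing a point of the solution.

    At most 2n points lie outside the combs (Lemma 1), and each comb can hide
    at most `comb_capacity` points, so the rest must spread over this many
    combs at least.
    """
    points_in_combs = max(0, solution_size - 2 * n)
    # Ceiling division without floating point.
    return -(-points_in_combs // comb_capacity)
```

For example, with n = 5 and capacity 25, a solution of 85 points leaves at least 75 points in the combs and therefore touches at least 3 of them.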

Lemmas 2 and 3 and the fact that $|I'| < 10n^3$ allow us to prove our first main
result, using standard concepts of gap-preserving reductions (see m- The proof 
easily carries over to the vertex restricted version of the problem. 

Theorem 1. Maximum Hidden Set on Polygon with Holes and Maximum
Hidden Vertex Set on Polygon with Holes cannot be approximated

by any polynomial time algorithm with an approximation ratio of - — , where 

$|I'|$ is the number of vertices in the polygon, and where $\gamma > 0$, unless NP = P.

3 Inapproximability Results for the Terrain Problems 

Theorem 2. The problems Maximum Hidden Set on Terrain and Maxi- 
mum Hidden Vertex Set on Terrain cannot be approximated by any poly- 

nomial time algorithm with an approximation ratio of - — , where \I"\ is the 

number of vertices in the terrain, and where $\gamma > 0$, unless NP = P.
Fig. 4. (a) Schematic construction, (b) Variable pattern (with a TRUE-leg and a FALSE-leg)



Proof. The proof very closely follows the lines of the proof for the
inapproximability of Maximum Hidden (Vertex) Set on Polygon with Holes. We use
the same construction, but given the polygon with holes of instance I' we create
a terrain (i.e. instance I'') by simply letting all the area outside the polygon
(including the holes) have height h and by letting the area in the interior have
height 0. We add four vertices to the terrain by introducing a rectangular
bounding box around the regular 2n-gon. This yields a terrain with vertical walls,
which can be easily modified to have steep but not vertical walls, as required by
the definition of a terrain. Finally, we triangulate the terrain. The terrain thus
obtained looks like a canyon of a type that can be found in the south-west of
the United States. All proofs work analogously. □
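
The height assignment of the canyon terrain can be sketched as a function on the plane, before the walls are made non-vertical and before triangulation. The sketch below uses a standard even-odd ray-casting test; points exactly on a boundary are not treated specially, and the function names and the default value of h are assumptions.

```python
def point_in_ring(point, ring):
    """Even-odd ray casting for a closed ring given as a list of (x, y)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through `point` count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def terrain_height(point, outer, holes, h=1.0):
    """Height 0 in the polygon interior, height h outside it and in the holes."""
    in_interior = point_in_ring(point, outer) and not any(
        point_in_ring(point, hole) for hole in holes
    )
    return 0.0 if in_interior else h
```

Visibility on the resulting terrain then mirrors visibility in the polygon as long as h is large enough that no line of sight between floor points passes over a wall.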

4 Inapproximability Results for the Problems for 
Polygons Without Holes 

We reduce Maximum 5-Occurrence-2-Satisfiability to Maximum Hidden
Set on Polygon without Holes to prove the APX-hardness of Maximum
Hidden Set on Polygon without Holes. The same reduction will
also work for Maximum Hidden Vertex Set on Polygon without Holes
with minor modifications. Suppose we are given an instance I of Maximum
5-Occurrence-2-Satisfiability, which consists of n variables $x_0, \ldots, x_{n-1}$ and
m clauses $c_0, \ldots, c_{m-1}$. We construct a polygon without holes, i.e. an instance
I' of Maximum Hidden Set on Polygon without Holes, which consists
of clause patterns and variable patterns, as shown schematically in Fig. 4 (a).
The construction uses concepts similar to those used in [Ij. The details of the
construction are similar to a construction in [^, and will therefore be omitted. It
is, however, necessary to introduce the variable pattern. We construct a variable
pattern for each variable $x_i$ as indicated in Fig. 4 (b). The cone-like feature drawn
with dashed lines simply helps in the construction and is not part of the polygon
Fig. 5. Clause Pattern with cones

boundary. It represents the link to the clause patterns, as indicated in Fig.
Each variable pattern consists of a TRUE- and a FALSE-leg. The reduction has 
the following properties: 

Lemma 4. If there exists a truth assignment S to the variables of I that satisfies
at least $(1 - \epsilon)m$ clauses, then there exists a solution S' of I' with
$|S'| \ge 10n + 2m + (1 - \epsilon)m$.

Proof. If variable $x_i$ is TRUE in S, then we let the vertices $f_1, \ldots, f_5$ and w of
the TRUE-leg of $x_i$, as well as the vertices $v_1, v_2, v_3$ and w of the FALSE-leg of
$x_i$ be in the solution S'. Vice-versa if $x_i$ is FALSE in S. This gives us 10n points
in S'. The remaining points for S' are in the clause patterns. Figure 5 shows the
clause pattern for a clause $(x_i, \neg x_j)$ together with the cones that link the clause
pattern to the corresponding variable patterns. Remember that these cones are
not part of the polygon boundary. To understand Fig. 5, assume $x_i$ is assigned
the value FALSE and $x_j$ is assigned the value TRUE, i.e., the clause $(x_i, \neg x_j)$ is
not satisfied. Then there is a point in the solution that sits at vertex $f_k$ (for some
k) in the FALSE-leg of $x_i$ and a point that sits at vertex $f_{k'}$ (for some k') in
the TRUE-leg of $x_j$. In this case, we can have only two additional points in the
solution S' at points Q , ® • In the remaining three cases, where the variables 
Xi and Xj are assigned truth values such that the clause is satisfied, we can have 
three additional points in S'' at Q - ® . Therefore, we have 2 points from all 
unsatisfied clauses and 3 points from all satisfied clauses, i.e. $2\epsilon m + 3(1 - \epsilon)m$
points that are hidden in the clause patterns. Thus, $|S'| \ge 10n + 2m + (1 - \epsilon)m$,
as claimed. □
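
The size of the hidden set built in this proof depends only on how many clauses the assignment satisfies: 10 points per variable, 3 per satisfied clause and 2 per unsatisfied clause. The sketch below computes that size. The literal encoding (positive integer k for $x_k$, negative for its negation, variables numbered from 1) and the function names are assumptions made for illustration.

```python
def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal under `assignment`."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def hidden_set_size(num_vars, clauses, assignment):
    """Size of the hidden set obtained from a truth assignment."""
    satisfied = count_satisfied(clauses, assignment)
    unsatisfied = len(clauses) - satisfied
    return 10 * num_vars + 3 * satisfied + 2 * unsatisfied
```

Writing s for the number of satisfied clauses, the result equals 10n + 2m + s, which is the form used in the lemma.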

Lemma 5. If there exists a solution S' of I' with $|S'| \ge 10n + 3m - (\epsilon + \gamma)m$,
then there exists a variable assignment S of I that satisfies at least $(1 - \epsilon - \gamma)m$
clauses.

Proof. For any solution S', we can assume that in each leg of each variable 
pattern, all points in S' are either in the triangles of vertices $f_1, \ldots, f_5$ and w,

^ The proofs work accordingly for other types of clauses, such as $(x_i, x_j)$.

^ Note that a point at some vertex $f_k$ actually sees a slightly larger cone than indicated
in Fig. 5. This problem can be dealt with by making the triangle of $f_k$ very small.
Corresponding methods are used to solve similar problems in and jlj.
or in the triangles of vertices $v_1, v_2, v_3$, and w (see Fig. 4 (b) for the definition of
these triangles), since any point in any triangle of $f_1, \ldots, f_5$ sees the triangles of
$v_1, v_2, v_3$ completely and any single point in the leg outside these triangles would
see almost all (at least 3) of these triangles, and we could obtain better solutions
easily. We transform the solution S' (with $|S'| \ge 10n + 3m - (\epsilon + \gamma)m$) in such a
way that it remains feasible and that its size (i.e. the number of hidden points)
does not decrease. This is done with an enumeration of all possible cases, i.e.
we show how to transform the solution if there is a point in 3, 4, or 5 of the
triangles of the points $f_1, \ldots, f_5$ in the TRUE-leg and the FALSE-leg of a variable
pattern. The transformation is such that at the end, we have for each variable
pattern the six points at $f_1, \ldots, f_5$, and w from one leg in the solution and the
4 points $v_1, v_2, v_3$, and w from the other leg. Thus, we can easily obtain a truth
assignment for the variables by letting variable $x_i$ be TRUE iff the six points
at $f_1, \ldots, f_5$, and w from the TRUE-leg are in the solution. The transformed
solution S' consists of at least $10n + 3m - (\epsilon + \gamma)m$ points, 10n of which lie in
the variable patterns. At most 3 points can lie in each clause pattern. If 3 points
lie in a clause pattern, then this clause is satisfied. Therefore, if 2 points lie in
each clause pattern, there are still at least $(1 - \epsilon - \gamma)m$ additional points in S'.
These must lie in clause patterns as well. Therefore, at least $(1 - \epsilon - \gamma)m$ clauses
are satisfied. □

Lemmas 4 and 5 show how to transform two promise problems into one another.
By using standard concepts of gap-preserving reductions and by introducing 
some minor modification for the vertex-restricted problem, we obtain: 

Theorem 3. Maximum Hidden (Vertex) Set on Polygon without Holes
is APX-hard, i.e. there exists a constant $\delta > 0$ for each of the two
problems such that no polynomial time approximation algorithm for the problem
can achieve an approximation ratio of $1 + \delta$, unless P = NP.

5 Conclusion 

We have shown that the problems Maximum Hidden (Vertex) Set on
Polygon with Holes and Maximum Hidden (Vertex) Set on Terrain are
almost as hard to approximate as Maximum Clique. We could prove a stronger
inapproximability ratio for all these problems, but only under the assumption
that coR ≠ NP, using the stronger inapproximability result for Maximum
Clique from [7]. Furthermore, we have shown that Maximum Hidden (Vertex)
Set on Polygon without Holes is APX-hard. Note that an approximation
algorithm for all considered problems that simply returns a single vertex
achieves an approximation ratio of n. Note that our proofs can easily be modified to
work as well for polygons or terrains, where no three vertices are allowed to be
collinear. We have classified the problems Maximum Hidden (Vertex) Set
on Polygon with Holes and Maximum Hidden (Vertex) Set on Terrain
to belong to the class of problems inapproximable with an approximation
ratio of $n^{\epsilon}$ for some $\epsilon > 0$, as defined in |T]. The APX-hardness results for






S. Eidenbenz 



the problems for polygons without holes, however, do not precisely characterize
the approximability characteristics of these problems. The gap between the best
(known) achievable approximation ratio (which is n) and the best
inapproximability ratio is still very large for these problems and should be closed in future
research. As for other future work, we plan to consider several variations of the
problems presented. For example, we plan to try to hide non-zero-dimensional
objects.
Abstract. We introduce an online relocation problem on a graph, in
which a player who walks around the nodes makes decisions on whether
to relocate mobile resources, while not knowing the future requests. We
call it Carrying Umbrellas. This paper gives a necessary and sufficient
condition under which a competitive algorithm exists, describes an
optimal algorithm, and analyzes its competitive ratio. We also extend this
problem to the case of digraphs.



1 Introduction 

“To carry an umbrella or not?” This is an everyday dilemma. To illustrate some
of our concepts, we describe a detailed scenario. Picture a person who walks
around N places. At each place, he is told where to go next and then must go
there. Like most people, he dislikes being caught in the rain without an umbrella.
Since today’s weather forecast is correct, he knows whether it will rain today.
However, he does not know the future destinations and weather, which might
be controlled by a malicious adversary. We say a person is safe if he never
gets wet. There is a trivial safe strategy: 'Always carry an umbrella'. However,
carrying an umbrella on sunny days is annoying and stupid. As an alternative,
he places several umbrellas in advance and thinks about an efficient strategy;
he hopes, through some cleverness, to minimize the number of days he carries
umbrellas. Adopting competitive analysis [8], we say a person’s strategy
is competitive if he carries umbrellas on a small number of days (for details, see
Subsection 1.1). The following questions are immediate: 1. What is the minimum
number of umbrellas with which the person can be safe and competitive? (For
example, is placing one umbrella per place sufficient?) 2. What is a safe and
competitive strategy? We formally define the problem next.



1.1 A Game 

Let G = (V,E) be a simple undirected graph with N vertices and M edges. 
(An extension to digraphs is straightforward and considered only in Section 5.) 
An integer u(v) is associated with each vertex $v \in V$, indicating the number of

* Supported in part by KOSEF grant 98-0102-07-01-3. 
umbrellas placed on the vertex v. We use u(G) to represent the total number of
umbrellas in G. Consider an on-line game between a player and the adversary,
assuming that u(G) is fixed in advance.

Game Carrying-Umbrellas(G, u(G))

1. Initialization

Player determines the initial configuration of u(G) umbrellas;

Adversary chooses the initial position $s \in V$ of Player;

2. for i = 1 to L do

(a) Adversary specifies (v, w), where w is a boolean value and v

is adjacent to the current vertex of Player;

(b) Player goes to v; if w = 1, he must carry at least one umbrella;

As an initialization, it is assumed that the player first determines the initial
configuration of u(G) umbrellas and later the adversary chooses the initial position
$s \in V$ of the player, although our results described in the sequel also hold even
if the initialization is done in reverse order. A play consists of L phases, where
L is determined by the adversary and unknown to the player. In each phase,
the adversary gives a request (v, w), where v is adjacent to the current vertex
of the player and w is a boolean value indicating the weather; w = 1 indicates
that it is rainy, and w = 0 that it is sunny. For this request (v, w), the player must go to
the specified vertex v and decide whether to carry umbrellas or not. If w = 1, he
must carry at least one umbrella; only on sunny days may he choose whether to carry
umbrellas. Notice that the player can carry an arbitrary number of umbrellas, if
available.

The player loses if he cannot find any umbrella at the current vertex in a 
rainy phase. A strategy of the player is said to be safe if it is guaranteed that 
whenever w = 1 the player carries at least one umbrella. Moreover, a strategy of 
the player should be efficient. 

As a measure of efficiency, we adopt the competitive ratio [8], which has been
widely used in analyzing the performance of online algorithms. Let $\sigma = \sigma_1 \cdots \sigma_L$
be a request sequence of the adversary, where $\sigma_i = (v_i, w_i)$ is the request in
the i-th phase. The cost of a strategy A for $\sigma$, written $C_A(\sigma)$, is defined as the
number of phases in which the player carries umbrellas by the strategy A. Then,
the competitive ratio of the strategy A is $\sup_{\sigma} C_A(\sigma) / C_{Opt}(\sigma)$, where Opt is the
optimal off-line strategy. (Since Opt knows the entire $\sigma$ in advance, it pays the minimum
cost. However, Opt cannot be implemented by any player and is used only for
comparison.) A strategy whose competitive ratio is bounded by c is termed
c-competitive; it is simply said to be competitive, if it is c-competitive for some
bounded c irrespective of the length of $\sigma$. The player wins, if he has a safe and
competitive strategy; he loses, otherwise.
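
The rules of the game and the cost measure can be made concrete with a small simulator. This is a sketch under stated assumptions: the strategy interface (a function of the current vertex, the target, the weather and the number of umbrellas available) and all names are hypothetical, and the two sample strategies are naive baselines rather than the strategy analyzed in this paper.

```python
def play(adjacency, umbrellas, start, requests, strategy):
    """Simulate one play; return (safe, cost).

    `requests` is a list of (v, w) pairs with w True for rain. The cost is
    the number of phases in which at least one umbrella is carried. The
    simulation stops as soon as the player walks in the rain unprotected.
    """
    umbrellas = dict(umbrellas)          # do not mutate the caller's dict
    current, cost = start, 0
    for target, rainy in requests:
        if target not in adjacency[current]:
            raise ValueError("request must name an adjacent vertex")
        available = umbrellas.get(current, 0)
        carried = strategy(current, target, rainy, available)
        if not 0 <= carried <= available:
            raise ValueError("strategy carries an impossible number")
        if rainy and carried == 0:
            return False, cost           # the player gets wet
        if carried > 0:
            cost += 1
        umbrellas[current] = available - carried
        umbrellas[target] = umbrellas.get(target, 0) + carried
        current = target
    return True, cost


def always_carry(current, target, rainy, available):
    """Carry one umbrella whenever one is available."""
    return 1 if available > 0 else 0


def carry_only_in_rain(current, target, rainy, available):
    """Carry one umbrella only when it rains and one is available."""
    return 1 if rainy and available > 0 else 0
```

On a single edge with two umbrellas at the start vertex, `carry_only_in_rain` is cheaper than `always_carry` for the request sequence rain, sun, rain, but it is not safe in general: a sunny move away from the only umbrella followed by a rainy request leaves the player unprotected.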

Not surprisingly, whether the player has a winning strategy depends on the
number of umbrellas. In this paper, we determine how many umbrellas are needed.

Definition 1. For a graph G, we define u*(G) to be the minimum number of
umbrellas with which the player has a winning strategy in G.
1.2 Summary of Our Results

Let G = (V, E) be a simple graph (having no parallel edges) with N vertices
and M edges. We show that u*(G) = M+1. That is, no strategy of the player is
safe and competitive if u(G) < M+1 (Section 3) and there exists a competitive
strategy of the player if u(G) ≥ M+1 (Section 4). The competitive ratio of our
strategy is b(G), the number of vertices in the largest biconnected component
in G. Moreover, the upper bound is attained by a ‘weak’ player that carries at
most one umbrella in a phase, and the competitive ratio b(G) is optimal for
some graphs when u(G) = M+1. Finally, we extend this problem to the case
of digraphs (Section 5). Interestingly, u*(G) of a digraph G seems to be closely
related to the structure of G. We present general upper and lower bounds on
u*(G), which naturally reduce to M+1 in undirected graphs.

1.3 Related Works 

Every online problem can be viewed as a game between an online algorithm and
the adversary 1 713 16 1 5| . In this respect, Ben-David et al. [3] introduced the
request-answer game as a general model of online problems. Clearly, the problem of the
present paper is an example of the request-answer game. Many research works,
though not directly related to our problem, have studied online problems on
a graph. Examples include the k-server problem |T| and graph coloring |2|4| .

2 Lower Bound 

This section describes the lower bound on u*(G). For a simple explanation, we
first consider a ‘weak’ player that can carry at most one umbrella in a phase,
and show that u*(G) ≥ M+1 for the weak player. Later the same bound is
extended to the general player that can carry multiple umbrellas.

Theorem 1. Let G be a simple graph (with no parallel edges) with N vertices
and M edges. Under the constraint that the player can carry at most one umbrella
in a phase, u*(G) ≥ M+1.

Proof. Assume that G is connected (otherwise, the number of umbrellas in some
connected component of G is no larger than the number of edges in it, and our
proof can be applied to that component). We show that if u(G) ≤ M then no
strategy of the player is safe and competitive in G. Equivalently, it suffices to
show that no safe strategy of the player is c-competitive, for any fixed c (< ∞).
For convenience, imagine we are the adversary that would like to defeat the player.

Let σ = σ_1 σ_2 ··· σ_L denote the request sequence that we generate. Recall that
the starting vertex s of the player is determined by the adversary; we choose an
arbitrary vertex as s.

Suppose we were somehow able to keep the player at s until u(s)
becomes d+1, where d is the degree of s in G. This would tell us that G − {s}
is 'deficient' in umbrellas, that is, u(G − {s}) is strictly less than the number of
edges in G − {s}. In the next step, we make the player go into G − {s}. Even if
the player carries an umbrella along, u(G − {s}) is no larger than the number of
edges in G − {s}, and the player is in it. Now we use recursion: we prevent the
player from being safe and competitive in G − {s}. (In fact, some care is needed to
make the overall competitive ratio unbounded.) So our subgoal is to make u(s)
increase to d+1.

Let v_1, ..., v_d be the vertices adjacent to s and let C_1, ..., C_d denote the
maximal connected components in G − {s} such that C_i includes v_i. Of course,
C_i may happen to be equal to C_j even when i ≠ j. Let u(C_k) denote the number of
umbrellas placed in C_k. Observe that if G − {s} is deficient in umbrellas, then
some component C_k also is.

We begin by explaining how to increase u(s). Suppose that at the start
of the (2i−1)-th phase, the player is located at s and u(s) = j. In the subsequent two
phases, the adversary makes the player go to v_j and return to s. That is, σ_{2i−1} =
(v_j, sunny) and σ_{2i} = (s, w_{2i}). The 2i-th weather w_{2i} is determined according
to the player's decision in the (2i−1)-th phase: if the player carried an umbrella
in the (2i−1)-th phase, w_{2i} is set to sunny; otherwise it is set to rainy. The strategy
of the adversary is summarized below. The first and second if parts generate
σ_{2i−1} and σ_{2i}, and the last if part checks whether a deficient C_k exists. If
it exists, the recursive procedure Adversary(C_k, v_k) is called. Remember that if
u(s) becomes d+1, a deficient C_k exists and the recursive procedure starts.

Adversary(G, s)
    i = 1;
    while TRUE do
        if u(s) = j then
            σ_{2i−1} = (v_j, sunny);
        if (the player carried an umbrella for σ_{2i−1}) then
            σ_{2i} = (s, sunny);
        else
            σ_{2i} = (s, rainy);
        if (there is C_k such that u(C_k) < |E(C_k)|) then
            σ_{2i+1} = (v_k, sunny);
            Adversary(C_k, v_k);
        i = i + 1;
    end-while

To see why this request sequence makes u(s) increase, we define a weighted
sum Φ of the umbrellas placed in vertex s and its neighbors, where the j umbrellas
in s weigh 0.5, 1.5, ..., j − 0.5, respectively, and each umbrella in vertex v_i weighs
i. Specifically,

\[ \Phi = \sum_{i=1}^{d} i \cdot u(v_i) + \sum_{j=1}^{u(s)} (j - 0.5) \]

Let Φ_k denote Φ at the end of the k-th phase. We are interested in the change
between Φ_{2k−2} and Φ_{2k}. Suppose that u(s) is j at the end of the (2k−2)-th phase.
Depending on the decision of the player, there are four cases to consider.

Case A. If the player carried an umbrella throughout the (2k−1)-th and 2k-th phases,
we have Φ_{2k−2} = Φ_{2k} because the number of umbrellas at each vertex is unchanged.
However, the weather must be sunny in both phases by Adversary(G, s). Therefore,
the player did a useless carrying.

Case B. If the player carried an umbrella from s to v_j only, Φ_{2k} = Φ_{2k−2} + 0.5.
This is because the umbrella moved had weight j − 0.5 at s and has weight j at v_j.

Case C. If the player carried an umbrella from v_j to s only, Φ_{2k} = Φ_{2k−2} + 0.5.
This is because the weight of the umbrella moved changes from j to j + 0.5.

Case D. The remaining case is that the player never carried an umbrella. This is
impossible by Adversary(G, s), because if the player did not carry an umbrella
in the (2k−1)-th phase, w_{2k} is set to rainy.

In summary, over the two phases 2k−1 and 2k, Φ increases or is unchanged; if Φ is
unchanged, the player did useless carrying on consecutive sunny days. Note that
successive useless carryings make the player's strategy not competitive, because
Opt would not carry an umbrella over two sunny days. Hence, in order to be
competitive, the player must sometimes increase Φ.
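
The case analysis can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names and the list representation of u(v_1), ..., u(v_d) are assumptions made for illustration. It evaluates Φ and applies the moves of Case B and Case C.

```python
def potential(u_s, u_nbrs):
    """Weighted sum Phi: the j-th umbrella at s weighs j - 0.5 and
    every umbrella at neighbor v_i (1-indexed) weighs i."""
    at_s = sum(j - 0.5 for j in range(1, u_s + 1))
    at_nbrs = sum(i * cnt for i, cnt in enumerate(u_nbrs, start=1))
    return at_s + at_nbrs


def case_b(u_s, u_nbrs):
    """Case B: with u(s) = j, one umbrella is carried from s to v_j
    and the player returns to s without an umbrella."""
    j = u_s
    nbrs = list(u_nbrs)
    nbrs[j - 1] += 1
    return u_s - 1, nbrs


def case_c(u_s, u_nbrs):
    """Case C: with u(s) = j, the player walks to v_j without an umbrella
    and carries one umbrella from v_j back to s."""
    j = u_s
    nbrs = list(u_nbrs)
    nbrs[j - 1] -= 1
    return u_s + 1, nbrs
```

For instance, with d = 3, u(s) = 2 and one umbrella on each of v_1 and v_2, both moves raise Φ by exactly 0.5, as claimed in Cases B and C.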

How many times can Φ increase before u(s) becomes d+1? Recall that once
u(s) becomes d+1, some deficient C_k exists and we start the recursive procedure.
While u(s) ≤ d, each umbrella can have 2d different weights. Moreover, since
u(G) is less than or equal to the number of edges in G, Φ can increase in at most
2d × M < N^3 phases before u(s) becomes d+1. Thus, Φ can increase in at most N^3
phases before the recursive procedure is called.

In Adversary(C_k, v_k), we recursively define a request sequence and a weighted
sum Φ'. Thus, in C_k, Φ' can increase in at most (N−1)^3 phases before the player
moves into some deficient subgraph of C_k. Over all recursive procedures, the weighted
sums can increase in at most N^3 + (N−1)^3 + ··· + 2^3 < N^4 phases. In the limiting case
that N = 2, the player always does useless carrying. This means that the player
carries an umbrella in all but at most N^4 phases, and that for the same input, Opt
carries an umbrella in at most N^4 phases.

Finally, we choose the length L of the input sequence σ sufficiently large.
Specifically, we let L > (c+1) · N^4. Then, the cost of the player's strategy A
is

\[ C_A(\sigma) \ge L - N^4 > c \cdot N^4 . \]

And the cost of Opt is

\[ C_{Opt}(\sigma) \le N^4 . \]

Therefore, the competitive ratio is, for an arbitrary c (< ∞),

\[ \frac{C_A(\sigma)}{C_{Opt}(\sigma)} > c , \]

completing the proof. □

With a small modification, we can show the following theorem, but the details are
omitted in this abstract.

Theorem 2. If G is a graph with M edges (but no parallel ones), u*(G) ≥ M+1.

3 Upper Bound

In this section, we show that u*(G) is no larger than M+1. It suffices to show that
if u(G) = M+1 then the player has a safe and competitive strategy. Throughout
this section, we assume that u(G) = M+1. To show this, we only consider
the weak player that carries at most one umbrella in a phase. Imagine that we
are the player that should be safe and competitive against the adversary.

The heart of our strategy is how to decide whether to carry an umbrella on
sunny days. To help this decision, we maintain two things: labels of umbrellas and
a rooted subtree of G. First, let us explain the labels, recalling that u(G) = M+1.
M umbrellas are labeled γ(e), one for each edge e ∈ E, and the one remaining umbrella
is labeled Current. Under these constraints, labels are updated during the
game. In addition to the labels, we also maintain a dynamic rooted subtree S of
G, called the skeleton, which satisfies the following invariants.

- S is a subtree of G whose root is the current position of the player.

- The root of S has the umbrella labeled Current.

- For an edge e ∈ S, the child-vertex of e has the umbrella labeled γ(e).

- For an edge e ∉ S, the umbrella labeled γ(e) lies in one endpoint of e.

Note that every umbrella labeled γ(e) is placed on an endpoint of e. In the
example of Figure 1, the current skeleton S is enclosed by a dotted line and every
edge in S is directed towards the child, indicating the location of its umbrella.




Fig. 1. Example of a skeleton (enclosed in a dotted line). 



S is initialized as follows: The player first chooses an arbitrary spanning tree
T and places an umbrella on each vertex of T, with their labels undetermined.
Then, the number of umbrellas used in T is N. For each edge e not in T, we place
an umbrella on either endpoint of it; this umbrella is labeled γ(e). The
number of umbrellas labeled so far equals the number of edges not in T, that is,
M−N+1, and so the total number of umbrellas used is M+1. After the adversary chooses
the start vertex s, the umbrellas in T are labeled. The umbrella in s is labeled
Current, and the umbrella in v (≠ s) is labeled γ(e), where e connects v
with its parent in T.
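
The counting argument of the initialization can be illustrated as follows. This is a minimal sketch under assumptions not made in the text: vertices are numbered 0..n−1, the spanning tree T is obtained by breadth-first search, and the umbrella γ(e) of a non-tree edge is always put on its first endpoint (the text allows either endpoint). The name `initial_placement` is hypothetical.

```python
from collections import deque


def initial_placement(n, edges, root=0):
    """Place M+1 umbrellas on a connected simple graph.

    Returns (count, tree): count[v] is the number of umbrellas on v and
    tree is the set of spanning-tree edges (as frozensets).
    """
    adj = {v: [] for v in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Build a BFS spanning tree T (any spanning tree would do).
    parent = {root: None}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    count = [1] * n  # one umbrella on every vertex of T: N umbrellas
    for a, b in edges:
        if frozenset((a, b)) not in tree:
            count[a] += 1  # gamma(e) of a non-tree edge, on one endpoint
    return count, tree
```

On a 4-cycle with one chord (N = 4, M = 5) the sketch places N + (M − N + 1) = 6 = M + 1 umbrellas.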

Now, we describe the strategy of the player. Suppose that currently the
player is at v_{i−1} and the current request is (v_i, w_i). If the weather is rainy, the
player has no choice; it has to move to v_i carrying the umbrella Current.
In this case, the player resets S to the single vertex v_i. It is easily seen that
the new skeleton satisfies the invariants, because only the umbrella Current is
moved. The more difficult case is when the weather is sunny.

Suppose that w_i is sunny. Depending on whether v_i belongs to S or
not, there are two cases. If v_i belongs to S (Figure 2(a)), then the player moves
to v_i without an umbrella. Though no umbrellas are moved, labels must be
changed. Let e_1, e_2, ..., e_r be the edges encountered when we traverse from v_{i−1}
to v_i in S. The umbrella Current is relabeled γ(e_1), the umbrella γ(e_k) is
relabeled γ(e_{k+1}) for 1 ≤ k ≤ r−1, and finally the umbrella γ(e_r) is relabeled
as the new Current. Observe that the new skeleton S still satisfies the invariants.

If v_i does not belong to S (Figure 2(b)), the edge e = (v_{i−1}, v_i) does not lie in
S. Remember that the umbrella γ(e) was at v_{i−1} or v_i by the invariants. If the
umbrella γ(e) was placed at v_{i−1}, the player moves to v_i carrying the umbrella
Current, and adds the vertex v_i and the edge (v_{i−1}, v_i) to S. If the umbrella
γ(e) was placed at v_i, the player moves to v_i without carrying an umbrella,
adds the vertex v_i and the edge (v_{i−1}, v_i) to S and, additionally, swaps the labels
of Current and γ(e). In both cases, it is easy to see that S still satisfies the
invariants (see Figure 2(b)).
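
The strategy described above can be sketched as a small simulation. This is an illustrative sketch only: the class name `SkeletonPlayer`, the vertex numbering 0..n−1, the BFS spanning tree and the choice of endpoint for non-tree edges are assumptions. Labels are tracked by location only: `loc[e]` is the vertex holding γ(e), and Current is always at the player's position.

```python
from collections import deque


class SkeletonPlayer:
    def __init__(self, n, edges, start):
        self.adj = {v: set() for v in range(n)}
        for a, b in edges:
            self.adj[a].add(b)
            self.adj[b].add(a)
        self.pos = start   # root of the skeleton S, holds Current
        self.parent = {}   # skeleton S as child -> parent pointers
        seen, queue = {start}, deque([start])
        while queue:       # initially S is a BFS spanning tree T
            x = queue.popleft()
            for y in sorted(self.adj[x]):
                if y not in seen:
                    seen.add(y)
                    self.parent[y] = x
                    queue.append(y)
        self.loc = {}      # loc[e] = vertex holding gamma(e)
        for a, b in edges:
            child = b if self.parent.get(b) == a else a
            self.loc[frozenset((a, b))] = child
        self.cost = 0

    def step(self, v, rainy):
        """Serve the request (v, weather); return True if an umbrella is carried."""
        old = self.pos
        assert v in self.adj[old], "requests move along an edge"
        e = frozenset((old, v))
        if rainy:
            self.parent = {}             # carry Current, reset S to {v}
            carried = True
        elif v in self.parent:           # v is in S: relabel along the path
            path = [v]
            while path[-1] != old:
                path.append(self.parent[path[-1]])
            for child, par in zip(path, path[1:]):
                self.loc[frozenset((child, par))] = par
                self.parent[par] = child  # re-root S at v
            del self.parent[v]
            carried = False
        elif self.loc[e] == old:         # gamma(e) at old vertex: carry Current
            self.parent[old] = v
            carried = True
        else:                            # gamma(e) at v: swap labels, no carrying
            self.loc[e] = old
            self.parent[old] = v
            carried = False
        self.pos = v
        self.cost += int(carried)
        return carried
```

In this sketch an umbrella is carried only in a rainy phase or when S grows by one vertex, which is the fact used in the competitive analysis below.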





Fig. 2. After Fig. 1, (v_i, sunny) is given. (a) v_i is in S. (b) v_i is not in S.



Theorem 3. Let G be a graph with N nodes and M edges. If u(G) ≥ M+1,
the player's strategy is safe and b(G)-competitive, where b(G) is the number of
vertices in the largest biconnected component in G.

Proof. First, the player's strategy is safe by the invariant that the player
always has the umbrella Current.

Next, we show that the strategy of the player is N-competitive. Let us divide
the request sequence into a number of stages, each of which contains exactly one
'rainy' and starts with 'rainy'. Remember that the player resets S to a single
vertex at the start of every stage. The cost of Player(G) in a stage is at most
N, because the cost 1 is paid only when S is reset to a single vertex or becomes
larger. The cost of Opt in a stage is at least one, because each stage starts with
'rainy'. Therefore, the competitive ratio is at most N. The competitive ratio can
be refined to b(G) by observing that for one rainy phase from u to v, we
have to reset S only in the biconnected component containing u and v. □

Thus we have that u*(G) = M+1. Unfortunately, the above competitive ratio is
the best possible for some graphs when u(G) = M+1.

Theorem 4. For an N-node ring G with u(G) = M+1, the competitive ratio of
any strategy is at least N.

Proof. Omitted. 

4 Directed Graphs 

In this section, we extend the results of the previous sections to the case of
digraphs. Our motivation is as follows: Let G = (V(G), A(G)) be a digraph. (For
convenience, it is assumed that G is strongly connected, that is, there is a path
from any vertex to any other vertex.) In the special case that G is a symmetric
digraph (obtained by replacing each edge of an undirected graph by two directed
arcs with opposite directions), u*(G) = |A(G)|/2 + 1 by the theorems in Sections 2
and 3. Does it hold for every digraph? It is easily seen that the answer is no.
A counterexample is given in Figure 3. For a graph G made by adding directed
arcs alternately to the undirected n-star, we have that u*(G) = u*(n-star),
regardless of the number of added arcs.




Fig. 3. For this graph G, u*(G) = u*(8-star) = 8. Thus u*(G) of a directed graph G
seems not to be directly determined by |A|, in contrast with undirected graphs.



This section shows upper and lower bounds on u*(G), which are not tight in
general but reduce to |E| + 1 for undirected graphs.

Let G be a digraph; see Figure 4. G_c is made by contracting paths consisting
of vertices with indegree 1 and outdegree 1. More exactly, if (v_1, v_2, ..., v_k) is a
one-way path such that the indegree and outdegree of v_i (2 ≤ i ≤ k−1) are 1,
we replace it by an arc (v_1, v_k). This operation is called a contraction, and G_c is
obtained by repeatedly applying contractions to G as long as possible. Let Ḡ_c
be the undirected graph (with no parallel edges) such that V(Ḡ_c) = V(G_c) and
Ḡ_c contains an edge (u, w) if and only if G_c contains an arc (u, w) or (w, u).
Observe that Ḡ_c is uniquely defined from G_c. We call Ḡ_c the underlying graph
of G_c.
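
The contraction and the underlying graph can be sketched as follows. This is a minimal illustration with assumptions the text does not settle: arcs are stored as a set of ordered pairs (so an arc created by a contraction merges with an existing identical arc), and a vertex is not contracted when this would create a self-loop. The function names are hypothetical; `upper_bound` evaluates the expression |E| + |V(G)| − |V(G_c)| + 1 on the underlying graph, as used in the upper bound stated next.

```python
def contract(arcs):
    """Repeatedly replace a -> v -> b by a -> b when v has indegree 1
    and outdegree 1 (skipping cases that would create a self-loop)."""
    arcs = set(arcs)
    while True:
        vertices = sorted({x for arc in arcs for x in arc})
        for v in vertices:
            ins = [a for (a, b) in arcs if b == v]
            outs = [b for (a, b) in arcs if a == v]
            if (len(ins) == 1 and len(outs) == 1
                    and v not in (ins[0], outs[0]) and ins[0] != outs[0]):
                arcs -= {(ins[0], v), (v, outs[0])}
                arcs.add((ins[0], outs[0]))
                break  # degrees changed, rescan from scratch
        else:
            return arcs  # no contractible vertex is left


def underlying(arcs):
    """Undirected graph with an edge {u, w} iff arc (u, w) or (w, u) exists."""
    return {frozenset(arc) for arc in arcs}


def upper_bound(arcs):
    """|E(underlying graph of G_c)| + |V(G)| - |V(G_c)| + 1."""
    g_c = contract(arcs)
    n_g = len({x for arc in arcs for x in arc})
    n_c = len({x for arc in g_c for x in arc})
    return len(underlying(g_c)) + n_g - n_c + 1
```

For a symmetric digraph no vertex is contracted in this sketch, and the expression reduces to M + 1, in agreement with the undirected case.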

Theorem 5. Let G be a digraph, G_c its contracted graph and Ḡ_c the underlying
graph of G_c. Then u*(G) ≤ |E(Ḡ_c)| + |V(G)| − |V(G_c)| + 1.

Fig. 4. (a) A graph G, (b) its contracted graph G_c, and (c) its underlying graph Ḡ_c.



Proof. Suppose that u(G) = |E(Ḡ_c)| + |V(G)| − |V(G_c)| + 1, where Ḡ_c denotes the
underlying graph of G_c. We present a competitive strategy. Among the vertices in G,
we call the ones in G_c c-vertices and the ones not in G_c p-vertices. Note that
the number of p-vertices is |V(G)| − |V(G_c)|. Roughly speaking, the player's
strategy decides whether to carry umbrellas or not depending on the |E(Ḡ_c)| + 1
umbrellas placed on c-vertices.

Let us explain the initialization of the u(G) umbrellas. |E(Ḡ_c)| + 1 umbrellas are
placed on c-vertices by applying the initialization for undirected graphs to the
underlying graph Ḡ_c. Each of the remaining |V(G)| − |V(G_c)| umbrellas is placed
at a distinct p-vertex. Then each p-vertex has one and only one umbrella.

Consider a phase, say j, in which it is rainy for the first time. Let (v_1, v_2, ..., v_k)
be a one-way path in G where v_1 and v_k are c-vertices and every v_i (i ≠ 1, k) is
a p-vertex. Without loss of generality, suppose that the player moves from v_i to
v_{i+1} at phase j. Then the player carries an umbrella afterwards until it reaches
v_k. If i = 1, then the player carries an umbrella from v_1 to v_k, which corresponds
to one rainy phase occurring in G_c. Otherwise, that is, if 2 ≤ i ≤ k−1, the player
carries an umbrella from v_i to v_k. By assuming that v_1 borrows an umbrella from
v_i and that this umbrella will be returned to v_i when the player goes to v_i again, we
can imagine that the player carries an umbrella from v_1 to v_k. Thus either case
corresponds to one rainy phase occurring in G_c.

Afterwards, as in Theorem 3, we construct the skeleton S in G_c and decide
whether to carry an umbrella or not according to S in G_c. Suppose that the player
lies on a c-vertex v_1 and the adversary requests a move to v_2, where (v_1, v_2, ..., v_k)
is a one-way path. If it is rainy, we follow the routine explained above and the
skeleton becomes the single vertex v_k. Otherwise, consider the strategy of the
player on G_c and determine whether the player carries an umbrella for the
request to move from v_1 to v_k on G_c. If the player carries an umbrella on G_c, it
also does so in G and always carries an umbrella while traveling to v_k. Otherwise,
the player does not carry an umbrella, but if v_1 borrowed an umbrella from v_i,
v_1 returns it.

After one rainy phase, the player can construct a full skeleton by carrying
an umbrella whenever visiting a c-vertex not in S. Thus the player's strategy is
|V(G)|-competitive. In fact, it is b(G)-competitive, where b(G) is the size of the
largest biconnected component in G. □

A lower bound can also be obtained for digraphs (proofs are omitted): For
a digraph G, let G'_c be the undirected graph (with no parallel edges) such that
V(G'_c) = V(G_c) and G'_c contains an edge (u, w) if and only if G_c contains both
arcs (u, w) and (w, u). Then u*(G) ≥ |E(G'_c)| + |V(G)| − |V(G_c)| + 1. Thus the
gap between the upper and lower bounds on u*(G) is |E(Ḡ_c)| − |E(G'_c)|, where Ḡ_c
is the underlying graph of G_c. Observe
that Theorem [His optimal for the graph given in Figure ID 

5 Concluding Remarks 

This paper is the first attempt at the online problem Carrying Umbrellas, and
many questions remain open. First, finding u*(G) for a digraph G seems to be
interesting from a graph-theoretical viewpoint. Similarly, the characterization of the
class of graphs such that u*(G) = |V(G)| is an interesting open problem (among
undirected graphs, the only such examples are trees). Second, we think that the
problem Carrying Umbrellas can be extended to be of practical use by adopting
different relocation constraints and considering general resources rather than
umbrellas. For now, however, it is mainly of theoretical interest. Finally, reducing
the competitive ratio with more umbrellas is an open problem.

Acknowledgements. We are grateful to Oh-Heum Kwon for motivating this work.

Abstract. We introduce new classes of graphs to investigate networks
that guarantee constant delays even in the case of multiple edge failures.
This means the following: as long as two vertices remain connected after
some edges have failed, the distance between these vertices in the
faulty graph is at most a constant factor k times the original distance.
In this extended abstract, we consider the case where the number of
edge failures is bounded by a constant ℓ. These graphs are called (k,ℓ)-
self-spanners. We prove that the problem of maximizing ℓ for a given
graph when k > 4 is fixed is NP-complete, whereas the dual problem of
minimizing k when ℓ is fixed is solvable in polynomial time. We show how
the Cartesian product affects the self-spanner properties of the composed
graph. As a consequence, several popular network topologies (like grids,
tori, hypercubes, butterflies, and cube-connected cycles) are investigated
with respect to their self-spanner properties.



1 Introduction 

Fault-tolerance represents one of the major concerns in network design, and a
large amount of research has been devoted to creating fault-tolerant parallel
architectures. The techniques used in this research can be divided into two
types. The first type consists of techniques that do not add redundancy to the
desired architecture. Instead, these techniques attempt to mask the effects of
faults by using the healthy part of the architecture to simulate the entire
machine. The second type consists of techniques that add redundancy to the
desired architecture. These techniques attempt to isolate the faults
while maintaining the complete desired architecture. The goal of
these techniques is to maintain the full performance of the desired architecture
while minimizing the cost of the redundant components. Other works present

* Research partially supported by Italian MURST project ‘Teoria dei Grafi e Appli- 
cazioni’, and by the Deutsche Forschungsgemeinschaft under grant Wa 654/10-2 



A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 205-|21^ 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 




S. Cicerone, G. Di Stefano, and D. Handke 



algorithms for routing messages around faults, or show how to perform certain 
computations in networks containing faults. 

In this paper we follow a different approach. Starting from the observation
that networks are usually modeled by graphs, we introduce and characterize
new classes of graphs that guarantee constant delay factors even when a limited
number of edges have failed. These graphs are called (k,ℓ)-self-spanners because
of their strict relationship to the concept of k-spanners.

A network modeled as a (k,ℓ)-self-spanner graph can be characterized as
follows: If at most ℓ edges have failed, as long as two vertices remain connected,
the distance between these vertices in the faulty graph is at most k times the
distance in the non-faulty graph. By fixing the values k and ℓ (called stretch
factor and fault-tolerance, respectively), we obtain a specific new graph class.
We are interested in characterizational as well as structural aspects of this class.

The motivation for this work comes from [4] and [8]. In [4], Cicerone and Di Stefano
introduce a class of graphs which guarantees constant delays even in the case of
an unlimited number of vertex failures (but their results do not carry over to
the dual case of edge failures). In [8], Handke considers the problem of finding
sparse subgraphs in a given graph such that the distance within this subgraph
is at most a (fixed) constant times the original distance, even when an edge or
a vertex in the subgraph fails.

This extended abstract contains the following results: After a short
introduction to the basic notation, we investigate networks modeled by
(k,ℓ)-self-spanners. The first main results concern the problem of deciding whether a
given graph is a (k,ℓ)-self-spanner: The problem is NP-complete for the general
case where k and ℓ are part of the input and remains NP-complete if k > 4
is fixed. However, if k ≤ 3 is fixed, or if ℓ > 0 is fixed, then there are polynomial
time algorithms. Thus, only the case where k = 4 is fixed remains open.

In the subsequent subsection, we examine how a graph that arises by the 
Cartesian product inherits the self-spanner properties of the underlying graphs. 
We also show strong results especially for small stretch factors and fault-tolerance 
values. 

The last part shows how the new graph classes of (k,ℓ)-self-spanners fit into
the context of some popular network topologies. We show for example that
mesh-like topologies such as grids, tori, and hypercubes exhibit strong self-spanner
properties, in particular for small fault-tolerance values. Bounded-degree
approximations of the hypercube such as butterflies and cube-connected cycles, however,
result in big stretch factors even in the case of small fault-tolerance values.

We have also considered k-self-spanners, i.e., networks that guarantee
constant delays even in the case of an unlimited number of edge failures, but, due to
space limitations, in this extended abstract we only summarize results concerning
(k,ℓ)-self-spanners. Full details and omitted proofs are included in [5].

2 Basic Notation 

In this work, we use standard notation for graphs. Let G = (V, E) be a simple,
unweighted, and undirected graph with n = |V| vertices and m = |E| edges.
G − e, where e ∈ E(G), is the graph obtained from G by deleting edge e.
d_G(u, v) is the distance between two vertices u and v in G, i.e., the length of a
shortest path.

An edge is a chord of a cycle C if it connects two non-adjacent vertices of C.
A cycle C is induced if it does not contain a chord. P_n is the path graph and C_n
is the induced cycle graph (also called ring), respectively, with n vertices. The
graph K_n is the complete graph on n vertices, and K_3 is also called triangle.

For a connected graph, an articulation vertex is a vertex the deletion of which
disconnects the graph. A graph is called biconnected (or 2-vertex-connected) if
it has no articulation vertex. It is called ℓ-vertex-connected if there is no subset
of vertices S of size ℓ−1 such that the subgraph of G induced by the vertices in
V \ S is disconnected. A graph is ℓ-edge-connected if no deletion of ℓ−1 edges
disconnects it. An edge e of G is called a bridge if G − e is disconnected.

For any fixed rational k ≥ 1, a k-spanner of an unweighted graph G is a
spanning subgraph S of G such that the distance between every pair of vertices
in S is at most k times their distance in G. The parameter k is called the stretch
factor. It is clear that, for an unweighted graph G, S is a k-spanner of G if and only
if S is a ⌊k⌋-spanner of G. Thus, it suffices to consider integer stretch factors.

Remark 1. A subgraph S = (V, E') of a graph G = (V, E) is a k-spanner if
and only if d_S(u, v) ≤ k for every edge e = {u, v} ∈ E \ E'.
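
Remark 1 reduces the k-spanner test to the edges that are missing from S. A minimal sketch of this test, assuming vertices numbered 0..n−1 and edge lists of pairs (the function names are hypothetical):

```python
from collections import deque


def distances_from(n, edges, src):
    """BFS distances from src in the undirected graph on vertices 0..n-1;
    unreachable vertices get None."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = [None] * n
    dist[src] = 0
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if dist[y] is None:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_k_spanner(n, edges, sub_edges, k):
    """Check d_S(u, v) <= k only for the edges {u, v} of G missing in S."""
    kept = {frozenset(e) for e in sub_edges}
    for u, v in edges:
        if frozenset((u, v)) in kept:
            continue
        d = distances_from(n, sub_edges, u)[v]
        if d is None or d > k:
            return False
    return True
```

For the ring C_5 with one edge removed, the remaining path is a 4-spanner but not a 3-spanner, since the endpoints of the removed edge are at distance 4 in the path.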



3 Bounded Delay and Limited Fault-Tolerance 



We are interested in graphs that exhibit the following property: If at most ℓ
edges in a graph G fail, then for all pairs of vertices that remain connected a
distance constraint is fulfilled. Thus, the number of faulty edges is limited to a
fixed number. See the full paper [5] for results on the case where the number
of edge failures is unlimited. Observe that we do not care for cases where two
vertices are separated by the failure of edges since then the definition of distance
does not apply; see also Remark 2 below. This leads to the following definition:

Definition 1 ((k,ℓ)-self-spanner).

1. For any fixed real k ≥ 1 and fixed integer ℓ ≥ 0, a graph G = (V, E) is called
a (k,ℓ)-self-spanner if for every subgraph G' = (V, E') of G with |E'| ≥ m − ℓ,

\[ d_{G'}(u, v) \le k \cdot d_G(u, v) \]

for every pair {u, v} of connected vertices in G'.

Denote the class of all (k,ℓ)-self-spanners by SS(k,ℓ). The parameter k is
called stretch factor, the parameter ℓ is called fault-tolerance value of the
class SS(k,ℓ).

2. For a graph G, minS_ℓ(G) denotes the smallest k such that G ∈ SS(k,ℓ),
whereas maxT_k(G) denotes the largest ℓ such that G ∈ SS(k,ℓ).
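
The definition can be turned directly into a brute-force membership test. The sketch below is only an illustration for small graphs: it enumerates all sets of at most ℓ failed edges and is therefore exponential in ℓ. The function names and the vertex numbering 0..n−1 are assumptions.

```python
from collections import deque
from itertools import combinations


def all_distances(n, edges):
    """All-pairs BFS distances; dist[u][v] is None if u and v are disconnected."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = []
    for s in range(n):
        d = [None] * n
        d[s] = 0
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if d[y] is None:
                    d[y] = d[x] + 1
                    queue.append(y)
        dist.append(d)
    return dist


def is_self_spanner(n, edges, k, l):
    """Test G in SS(k, l) by trying every set of at most l failed edges."""
    edges = list(edges)
    base = all_distances(n, edges)
    for f in range(1, l + 1):
        for failed in combinations(edges, f):
            rest = [e for e in edges if e not in failed]
            faulty = all_distances(n, rest)
            for u in range(n):
                for v in range(u + 1, n):
                    # only pairs that remain connected are constrained
                    if faulty[u][v] is not None and faulty[u][v] > k * base[u][v]:
                        return False
    return True
```

On the ring C_5 the sketch reports membership in SS(4,1) but not in SS(3,1), and a path is in SS(1,ℓ) because deleting edges of a tree never lengthens a surviving path.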




As an example, consider the 'opaque cube' G as shown in Figure 1. G belongs to
SS(3,1). Since minS_1(G) = 3 and maxT_3(G) = 1, G does not belong to
SS(3,2). Observe that the definition works equally well for connected and
disconnected graphs; but it is obvious that we can restrict our analysis to connected
graphs (otherwise we can deal with each connected component separately). Thus, in
the following we only consider connected graphs.

By similar arguments as in Remark 1, it suffices to consider only the faulty edges
of each subgraph if we want to check whether a given graph belongs to SS(k,ℓ):
For any subgraph G' = (V, E') of G = (V, E) with |E'| ≥ |E| − ℓ and E' ⊆ E,
we have to check if

(*) d_{G'}(u, v) ≤ k for every e = {u, v} ∈ E \ E'.

As a consequence, in the following, we only deal with integer stretch factors k.
We can furthermore simplify the procedure to check whether a graph belongs to
a class SS(k,ℓ): we do not have to consider all (possibly disconnected) subgraphs
but only connected subgraphs. We get the following:

Lemma 1. For any fixed integers k ≥ 1 and ℓ ≥ 0, G ∈ SS(k,ℓ) if and only
if every connected and spanning subgraph G' = (V, E') with |E'| ≥ m − ℓ and
E' ⊆ E is a k-spanner of G.

Proof. It suffices to show '⇐'. Suppose every connected spanning subgraph G' =
(V, E') with |E'| ≥ m − ℓ and E' ⊆ E is a k-spanner of G and, by contradiction,
assume that G is not a (k,ℓ)-self-spanner. By definition, there is a subgraph
G'' = (V, E'') with |E''| ≥ m − ℓ, E'' ⊆ E (not necessarily connected) such that
there is a pair of vertices u and v (within one connected component of G'') with
d_{G''}(u, v) > k · d_G(u, v). Since G is connected, there is also a connected subgraph
Ĝ = (V, Ê) with E'' ⊆ Ê ⊆ E (and thus |Ê| ≥ m − ℓ) constructed as follows: Let
C be the set of connected components of G''. Obtain Ĝ from G'' by adding |C| − 1
bridge edges such that Ĝ is minimally connected. Then d_Ĝ(u, v) > k · d_G(u, v)
and thus Ĝ is not a k-spanner of G, a contradiction. □

This lemma motivates the name of the class by its strict relationship with the
concept of k-spanners: a graph of this class 'spans itself'. Note that we cannot
directly incorporate vertex failures. Consider for example again the 'opaque cube'
as shown in Figure 1. As stated above, this graph is in SS(3,1), but the graph
G' obtained by removing the internal vertex is not (in fact, it has a stretch
factor minS_1(G') = 5, and thus is in SS(5,1)). Hence, in the case of SS(k,ℓ), we
purely model edge failures. In the sequel, we use Lemma 1 as a characterization
of the class SS(k,ℓ).

Remark 2. Note that the definition of (k,ℓ)-self-spanners does not imply that
G is (ℓ+1)-edge-connected. As stated above, we do not care for pairs of vertices
(or edges) that are separated by the edge failures. If we want to take this into
account (e.g., to achieve 'true' fault-tolerance, such that we can always guarantee
a 'short' connection between any pair of vertices even in the case of ℓ edge
failures) we can restrict our attention to graphs belonging to the intersection of
the classes of (ℓ+1)-edge-connected graphs and (k,ℓ)-self-spanners.

Fig. 1. Opaque cube.

3.1 Characterization of (k,ℓ)-Self-Spanners

We are interested in finding (strict) efficient characterizations for the class
SS(k,ℓ). For this aim, we start by stating some (more or less) straightforward
results on (k,ℓ)-self-spanners and define the problems to be considered formally.
As our main results, we establish an almost complete set of complexity results
for these problems. Let us first consider some trivial cases.

Lemma 2.

1. No delay, i.e., k = 1: For all ℓ > 0, SS(1,ℓ) is the set of all trees. If we omit
the connectivity constraint then SS(1,ℓ) is the set of all forests.

2. No edge failure, i.e., ℓ = 0: For all k ≥ 1, SS(k,0) = SS(1,0) is the set of
all (connected) graphs.

3. Weak delay constraints, i.e., large stretch factors: Any connected graph G
belongs to SS(k,ℓ) for all k ≥ |V| − 1 and for any ℓ ≥ 0.

4. Strong fault-tolerance constraints, i.e., large fault-tolerance values: Let G be
any connected graph. If G belongs to SS(k, |E| − |V| + 1) (for some k ≥ 1),
then G also belongs to SS(k,ℓ) for all ℓ ≥ |E| − |V| + 1.

Proof. Parts 1 and 2 are straightforward. To see Part 3, observe that if we allow
for a stretch factor of k ≥ |V| − 1 then we do not really impose a distance
constraint: any vertex of G may be used for a detour. It remains to prove Part 4.
If G is a (k,ℓ)-self-spanner for ℓ = |E| − |V| + 1 then, in particular, every
spanning tree of G is a k-spanner of G. If more than |E| − |V| + 1 edges fail
then the resulting subgraph is necessarily disconnected and thus (by Lemma 1)
no further constraints are imposed. □

Thus in the following, given a graph G we will only consider stretch factors
2 ≤ k ≤ |V| − 2 and fault-tolerance values 1 ≤ ℓ ≤ |E| − |V| + 1. The cases of
small and large stretch factors and small fault-tolerance values can be considered
trivial, whereas the case of large fault-tolerance values coincides with the case
of k-self-spanners (see [5]). By this lemma, it is clear that for every connected
graph G there are some parameters k and ℓ such that G belongs to SS(k,ℓ).
Analogously, if we fix one of the parameters we can always find a feasible value
for the other parameter. It is easy to see that (k,ℓ)-self-spanners have inductive
properties with respect to the parameters as stated below.

Lemma 3.

1. If k ≤ k' then SS(k,ℓ) ⊆ SS(k',ℓ).

2. If ℓ ≤ ℓ' then SS(k,ℓ') ⊆ SS(k,ℓ).

Remark 3. The class SS(k,ℓ) is not closed under subgraphs (cf. Figure 1). Also
it is not closed under supergraphs in the following sense: If a graph G is in
SS(k,ℓ) for some fixed k and ℓ then there may be a supergraph of G on the
same vertex set (i.e., a graph with additional edges) that does not belong to
SS(k,ℓ). The same still holds if we consider only (ℓ+1)-edge-connected graphs.

As a consequence of the previous remark, the self-spanner properties of a graph
cannot be inferred directly from the self-spanner properties of sub- or supergraphs.
As examples of standard graphs that exhibit some particular self-spanner
properties, it is easy to see that P_n ∈ SS(1,ℓ) for any ℓ ≥ 1 because P_n is a
tree. Furthermore C_n ∈ SS(n−1,ℓ) but C_n ∉ SS(n−2,ℓ) for any ℓ ≥ 1, since
minS_ℓ(C_n) = n−1 for any ℓ ≥ 1 (i.e., the failure of one edge results in a path
of length n−1).

Starting from the above observations, we are interested in finding non-trivial
parameters such that a graph is a (k, ℓ)-self-spanner. This includes the problem
of deciding for given parameters k and ℓ whether a given graph belongs to
SS(k, ℓ) as well as the more general recognition problems where we fix one
of the parameters and try to optimize the other. This brings up the following
optimization, resp. characterization problems:

Problem 1. Minimum ℓ-Stretch-Factor
Given: A graph G and an integer k ≥ 1.
Problem: Does G belong to SS(k, ℓ), i.e., minS_ℓ(G) ≤ k?

Problem 2. Maximum k-Fault-Tolerance
Given: A graph G and an integer ℓ ≥ 0.
Problem: Does G belong to SS(k, ℓ), i.e., maxT_k(G) ≥ ℓ?

Problem 3. General Self-Spanner
Given: A graph G and two integers k ≥ 2, ℓ ≥ 1.
Problem: Does G belong to SS(k, ℓ)?

Thus, in Problem 1 we consider ℓ as a fixed parameter for the problem whereas
in Problem 2 k is a fixed parameter. We now turn to analyzing the complexity
of the problems mentioned above. Let us first consider the special case where we
allow for single edge failures only, i.e., ℓ = 1.

Lemma 4. G ∈ SS(k, 1) if and only if every edge of G is either a bridge or
belongs to an induced cycle of length at most k + 1.

Proof. (⇐) Let e be an arbitrary edge of G and consider G' = G − e. We have
to show property (*) of the remark after Definition [H If e is a bridge in G then 
G' is disconnected and there is nothing to show. If e is not a bridge then G'
remains connected and by assumption G' is a k-spanner of G.

(⇒) Let us assume G ∈ SS(k, 1), and there is an edge e = {u, v} that is not
a bridge and that does not belong to an induced cycle of length at most k + 1.
Consider G' = G − e. Then d_{G'}(u, v) > k, a contradiction. □
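The characterization in Lemma 4 can be checked mechanically. The following is a minimal sketch, not the authors' algorithm: it assumes a simple undirected graph given as a list of edges and uses the distance form of the condition that appears in the proof (an edge {u, v} that is not a bridge must satisfy d_{G−e}(u, v) ≤ k). The function names `distance` and `in_ss_k_1` are illustrative.

```python
from collections import deque


def distance(adj, source, target, skip_edge):
    """BFS distance from source to target that never uses skip_edge.

    Returns None when target is unreachable, i.e. skip_edge is a bridge.
    """
    banned = set(skip_edge)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if {u, v} == banned:
                continue  # the failed edge may not be used
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None


def in_ss_k_1(edges, k):
    """Decide membership in SS(k, 1) for a simple graph given by its edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for u, v in edges:
        d = distance(adj, u, v, (u, v))
        if d is not None and d > k:  # non-bridge edge with a detour longer than k
            return False
    return True
```

For the cycle C_5 this reports membership in SS(4, 1) but not in SS(3, 1), in line with the observation that C_n ∈ SS(n − 1, ℓ) and C_n ∉ SS(n − 2, ℓ).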

Considering multiple edge failures, it is clear that bridges again do not contribute
to the stretch factor. But unfortunately we cannot extend the characterization in
a straightforward way. If we restrict ourselves to (ℓ + 1)-edge-connected graphs
we get the following lemma:
Lemma 5. Let G = (V, E) be (ℓ + 1)-edge-connected. Then G ∈ SS(k, ℓ) if and
only if for every edge e = {u, v} of G there are at least ℓ edge disjoint paths (not
involving e) of length at most k connecting u and v.

Proof. (⇐) Let G' = (V, E') be a subgraph with E' ⊆ E and |E'| ≥ |E| − ℓ, and
e = {u, v} ∈ E \ E'. There are ℓ edge disjoint paths (not involving e) of length at
most k connecting u and v. Thus, even if ℓ − 1 edge failures happen to appear
in one of these paths each, at least one covering path for e in G' remains.

(⇒) By contradiction, let us assume G ∈ SS(k, ℓ), and there is an edge e = {u, v}
such that there are at most j < ℓ edge disjoint paths (not involving e) of length
at most k connecting u and v. Since j is maximal, there are j edges within the
edge disjoint paths such that the following holds: the subgraph G' constructed
by deleting e and these selected edges remains connected (since G is (ℓ + 1)-
edge-connected) but d_{G'}(u, v) > k, a contradiction to G ∈ SS(k, ℓ). □

Observe that we cannot relax the edge-connectivity constraint in this lemma.
Consider for example the diamond consisting of a C_4 and one chord: this graph
belongs to SS(3, 2) but it does not fulfill the constraints of Lemma 5.

Now, if we fix the fault-tolerance value ℓ, we can determine the smallest
stretch factor of a given graph in polynomial time (e.g., even by exhaustive
search):

Theorem 1. Minimum ℓ-Stretch-Factor is in P for all ℓ ≥ 0.

As a consequence of this theorem we also have:

Corollary 1. The problem of deciding whether a graph is a (k, ℓ)-self-spanner
for fixed k ≥ 1 and ℓ ≥ 0 is in P.
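The exhaustive search mentioned before Theorem 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the graph is simple and connected, and G is taken to be in SS(k, ℓ) when every connected subgraph obtained by deleting at most ℓ edges is a k-spanner of G (disconnected subgraphs impose no constraint, as noted in the text). It uses the standard fact that the stretch of a spanning subgraph is attained on a deleted edge. The name `min_stretch_factor` is hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def min_stretch_factor(vertices, edges, ell):
    """Smallest k such that the graph is in SS(k, ell), by brute force.

    Enumerates all failure sets of size 1..ell; for fixed ell this is
    polynomially many sets, each handled with a few BFS runs.
    """
    worst = 1
    for r in range(1, ell + 1):
        for failed in combinations(edges, r):
            adj = {v: set() for v in vertices}
            for u, v in edges:
                if (u, v) not in failed:
                    adj[u].add(v)
                    adj[v].add(u)
            if len(bfs_distances(adj, vertices[0])) < len(vertices):
                continue  # disconnected: no constraint imposed
            for u, v in failed:
                worst = max(worst, bfs_distances(adj, u)[v])
    return worst
```

On the diamond (a C_4 with one chord) the sketch returns 2 for ℓ = 1 and 3 for ℓ = 2, which agrees with the statement that the diamond belongs to SS(3, 2).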

Now, we consider the dual problem where we fix the stretch factor and we want
to find the largest fault-tolerance value of a given graph. As stated in Lemma 5,
to solve this, it is crucial to find edge disjoint paths between any two vertices
such that the paths have bounded length. Unfortunately, this problem is hard
(cf. H, (ND41)): 

Problem 4. Maximum Length-Bounded Disjoint Paths
Given: A graph G, two vertices s and t, positive integers K, L ≤ n.
Problem: Does G contain L or more mutually edge disjoint paths from s to t,
which all have length at most K?

As shown in m, the problem is NP-complete for all fixed K ≥ 5; it is
polynomially solvable for K ≤ 3, and it is open for K = 4. The proofs given there
work for (ℓ + 1)-edge-connected graphs as well. Observe that the problem of
finding the maximum number of edge disjoint paths from s to t under no length
constraint is solvable in polynomial time by standard network flow techniques.
But this does not help here, since we need to guarantee the length constraints.
Together with the observation that we can decide in polynomial time whether
a given graph is (ℓ + 1)-edge-connected, we get the following results, and only
Maximum 4-Fault-Tolerance remains to be settled.
Theorem 2.

1. Maximum k-Fault-Tolerance is NP-complete for all k ≥ 5.

2. Maximum k-Fault-Tolerance is in P for k = 1, 2, 3.

3. General Self-Spanner is NP-complete.



3.2 Graph Operations for Constructing (k, ℓ)-Self-Spanners

Here we are interested in structural aspects of the class of (k, ℓ)-self-spanners. In
particular, we want to find (easy) operations that allow for efficient construction 
of self-spanner networks or easy recognition of special cases. A common opera- 
tion is the Cartesian (or cross) product [HI, used to define several well-known 
network topologies. 

The next lemma shows that graphs that arise from the Cartesian product of 
two graphs have strong self-spanner properties. In particular, it indicates that a 
stretch factor of 3 plays an important role. 

Theorem 3. Let G_1, G_2 be two connected graphs, G = G_1 × G_2, and i = 1, 2.

1. If G_i ∈ SS(k_i, ℓ_i) and G_i is (ℓ_i + 1)-edge-connected then
   G ∈ SS(max{k_1, k_2}, min{ℓ_1, ℓ_2}).

2. Let δ be the minimum vertex degree of both G_1 and G_2.
   Then G ∈ SS(3, δ) (in particular, G ∈ SS(3, 1)).

3. G ∈ SS(2, ℓ) if and only if every edge in G_i belongs to at least ℓ disjoint
   triangles within G_i.

4. If G_1 or G_2 contains a bridge then maxT_2(G) = 0, i.e., there is no ℓ > 0
   such that G ∈ SS(2, ℓ). In particular, if G_1 or G_2 contains a bridge and
   G ∈ SS(k, ℓ) for some ℓ > 0 then k ≥ 3.

Observe that, for Part 1 of the previous theorem, it is really necessary to require
the respective edge connectivity. Otherwise we cannot guarantee that the graph
considered in the proof remains connected. Also, for Part 3 of that theorem, it
does not suffice to require that G_1 ∈ SS(2, ℓ) (and G_2 ∈ SS(2, ℓ), resp.): we again
need that both graphs are (ℓ + 1)-edge-connected. For smaller stretch factors,
i.e., k = 1, we already know that G_1 × G_2 has a stretch factor smaller than 2 if
and only if it is a tree.

Remark 4. Part 2 of Theorem 3 is tight in the following sense: If G_i ∉ SS(2, 1)
and G_i has minimum vertex degree δ for i ∈ {1, 2}, then minS_δ(G_1 × G_2) = 3
and maxT_3(G_1 × G_2) = δ. Thus G_1 × G_2 ∈ SS(3, δ), but G_1 × G_2 ∉ SS(2, δ)
and G_1 × G_2 ∉ SS(3, δ + 1).



3.3 Self-Spanner Properties of Some Popular Network Topologies 

As we have seen in the previous subsection, we can construct graphs that exhibit 
certain self-spanner properties by using the Cartesian product. We now follow the 
opposite approach and examine some network topologies that are used widely, 
with respect to their self-spanner properties. In particular, we consider mesh-like 
networks like grid, torus, and hypercube. As examples for bounded-degree
approximations of the hypercube, we investigate cube connected cycles and butterfly
networks (see m and the references therein). 

Grids, tori, and hypercubes. The topologies that are easiest in structure are the
mesh-like networks, which can be constructed by application of the Cartesian
product: The hypercube H_d is recursively defined from P_2: H_1 = P_2, and H_d =
P_2 × H_{d−1} for each d ≥ 2; the grid G_{n,m} is the Cartesian product P_n × P_m for
n, m ≥ 2; the torus T_{n,m} is the Cartesian product C_n × C_m for n, m ≥ 3.

The following lemma indicates the self-spanner properties of these topologies. 
Observe that the fault-tolerance value of the torus is higher than that of the 
grid due to the additional wrap-around connections, which make the topology 
symmetric. But it is clear from Remark 3 that the addition of edges does not
result in higher fault-tolerance values in general. 

Theorem 4.

1. G_{n,m} belongs to SS(3, 1), but not to SS(2, 1).
   If n > 2 or m > 2 then G_{n,m} does not belong to SS(3, 2).
   If n, m > 2 then G_{n,m} belongs to SS(5, 2), but not to SS(4, 2) or SS(5, 3).

2. T_{n,m} belongs to SS(3, 2), but not to SS(2, 2).
   If n > 3 or m > 3 then T_{n,m} does not belong to SS(3, 3).
   T_{n,m} belongs to SS(min{5, max{n, m} − 1}, 3).
   If n, m > 5 then T_{n,m} belongs to SS(5, 4), but not to SS(4, 4).
   If n, m > 5 then T_{n,m} does not belong to SS(5, 5).

3. H_d belongs to SS(3, d − 1), but not to SS(3, d) or to SS(2, 1).
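The single-failure parts of Theorem 4 can be explored on small instances. The sketch below is illustrative only: it builds grids, tori, and hypercubes with a generic Cartesian product and computes the smallest k with G ∈ SS(k, 1) as the longest detour forced by one failed edge. The helper names (`path`, `cycle`, `cartesian`, `single_failure_stretch`) are assumptions of this example, and it covers only ℓ = 1.

```python
from collections import deque
from itertools import product


def path(n):
    """Path P_n on vertices 0..n-1."""
    return list(range(n)), [(i, i + 1) for i in range(n - 1)]


def cycle(n):
    """Cycle C_n on vertices 0..n-1 (n >= 3)."""
    return list(range(n)), [(i, (i + 1) % n) for i in range(n)]


def cartesian(g1, g2):
    """Cartesian product: copies of g1 for each vertex of g2, and vice versa."""
    (v1, e1), (v2, e2) = g1, g2
    vertices = list(product(v1, v2))
    edges = [((a, x), (b, x)) for a, b in e1 for x in v2]
    edges += [((a, x), (a, y)) for a in v1 for x, y in e2]
    return vertices, edges


def single_failure_stretch(graph):
    """Smallest k with the graph in SS(k, 1); bridges are ignored."""
    vertices, edges = graph
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    worst = 1
    for u, v in edges:
        adj[u].discard(v)  # let the edge {u, v} fail
        adj[v].discard(u)
        dist = {u: 0}
        queue = deque([u])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        adj[u].add(v)  # restore the edge
        adj[v].add(u)
        if v in dist:
            worst = max(worst, dist[v])
    return worst
```

For example, `single_failure_stretch(cartesian(path(3), path(4)))` evaluates to 3, matching the claim that G_{n,m} belongs to SS(3, 1) but not to SS(2, 1).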

Hypercube derived networks. We now consider two different types of bounded- 
degree approximations of the hypercube. 

The cube-connected cycles graph of dimension d, denoted CCC_d, is derived
from H_d by replacing each vertex of H_d by a fundamental cycle of length d. Each
vertex of such a cycle is labeled by a tuple (i, x), 0 ≤ i ≤ d − 1, and i is called
the level of the vertex. Apart from the cycle edges of the fundamental cycles,
each vertex (i, x) is connected to vertex (i, x(i)), where x(i) denotes the vertex
of H_d that is labeled by the same string as vertex x but with bit i flipped. These
edges are called hypercube edges.

The butterfly graph (with wrap-around) of dimension d, denoted B_d, is derived
similarly from H_d: B_d consists of the same vertices (i, x), 0 ≤ i ≤ d − 1, as
CCC_d and the same fundamental cycles of length d. But now each vertex (i, x)
is connected by two hypercube edges to vertices (i + 1, x(i)) and (i − 1, x(i − 1)).

CCC_d can be obtained from B_d by using a single edge {(i, x), (i, x(i))} instead
of the pair of edges {(i, x), (i + 1, x(i))} and {(i, x), (i − 1, x(i − 1))}. Thus, CCC_d
can be viewed as a spanning subgraph of B_d. In [2], it is shown that different
hypercube-derived topologies can be embedded within other such topologies with
small slowdown. Results on the existence of cycles and the construction of
k-spanners can be found in [14] and [9], respectively. But none of these results
implies anything about the self-spanner properties of the topologies studied here.
We get the following results concerning the self-spanner properties of the
topologies above:
Theorem 5.

1. B_d belongs to SS(5, 1) and to SS(d + 1, 2), but not to SS(2, 1), SS(d, 2), or
   SS(d + 1, 5).

2. CCC_d belongs to SS(7, 1) and to SS(max{7, d − 1}, 2), but not to SS(6, 1).

The previous theorem shows that bounded-degree approximations of the
hypercube like CCC_d and B_d perform poorly with respect to their self-spanner
properties: In the case of single edge failures the stretch factor is still a constant
(though much larger than for the hypercube), but for double edge failures the
stretch factor grows linearly with the dimension d. Thus, the guarantees for
delays in case of failures are really weak for these kinds of topologies.
Abstract. The main contribution of this work is to propose energy-efficient
randomized initialization protocols for ad-hoc radio networks
(ARN, for short). First, we show that if the number n of stations is
known beforehand, the single-channel ARN can be initialized by a protocol
that terminates, with high probability, in O(n) time slots with no
station being awake for more than O(log n) time slots. We then go on
to address the case where the number n of stations in the ARN is not
known beforehand. We begin by discussing an elegant protocol that provides
a tight approximation of n. Interestingly, this protocol terminates,
with high probability, in O((log n)^2) time slots and no station has to be
awake for more than O(log n) time slots. We use this protocol to design
an energy-efficient initialization protocol that terminates, with high
probability, in O(n) time slots with no station being awake for more than
O(log n) time slots. Finally, we design an energy-efficient initialization
protocol for the k-channel ARN that terminates, with high probability,
in O(n/k + log n) time slots, with no station being awake for more than
O(log n) time slots.



1 Introduction 

An ad-hoc radio network (ARN, for short) is a distributed system with no central 
arbiter, consisting of n radio transceivers, henceforth referred to as stations. We 
assume that the stations are identical and cannot be distinguished by serial or 
manufacturing number. We refer the reader to Figure 1 depicting a 7-station
ARN. 

As customary, time is assumed to be slotted and all the stations have a 
local clock that keeps synchronous time, perhaps by interfacing with a GPS 
system. The stations are assumed to have the computing power of a usual laptop 

* Work supported in part by ONR grant N00014-91-1-0526 and by Grant-in-Aid for 
Scientific Research (C) (10680351) from Ministry of Education, Science, Sports, and 
Culture of Japan. 
Fig. 1. Illustrating a 7-station ARN.



computer; in particular, they all run the same protocol and can generate random 
bits that provide local data on which the stations may perform computations. 

The stations communicate using k radio frequencies (i.e. channels). We as- 
sume that in any time slot, a station can tune to one radio channel and/or 
transmit on at most one, possibly the same, channel. A transmission involves a 
data packet whose length is such that the transmission can be completed within 
one time slot. 

We employ the commonly-accepted assumption that when two or more sta- 
tions are transmitting on a channel in the same time slot, the corresponding 
packets collide and are lost. We further assume that the system has collision 
detection capability. Accordingly, at the end of a time slot the status of a radio 
channel is: 

NULL: if no station transmitted on the channel in the current time slot,
SINGLE: if exactly one station transmitted on the channel in the current time
slot, and
COLLISION: if two or more stations transmitted on the channel in the current
time slot.
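The three channel states can be captured in a few lines. This is only an illustrative model of the collision-detection assumption stated above; the names `Status` and `channel_status` are not from the paper.

```python
from enum import Enum


class Status(Enum):
    NULL = "NULL"            # nobody transmitted
    SINGLE = "SINGLE"        # exactly one transmitter, packet is received
    COLLISION = "COLLISION"  # two or more transmitters, packets are lost


def channel_status(transmitters):
    """Status of one channel at the end of a time slot.

    `transmitters` is the collection of stations that transmitted on the
    channel during the slot.
    """
    count = len(transmitters)
    if count == 0:
        return Status.NULL
    if count == 1:
        return Status.SINGLE
    return Status.COLLISION
```

A listening station observes only this status (plus the packet itself in the SINGLE case), which is all the protocols discussed later rely on.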

We assume that the stations in the ARN run on batteries and saving battery 
power is exceedingly important as recharging batteries may not be possible while 
in operation. It is well known that a station expends power while its transceiver is 
active, that is, while transmitting or receiving a packet. It is perhaps surprising at 
first that a station expends power even if it receives a packet that is not destined 
for it. Consequently, we are interested in developing protocols that allow stations 
to power off their transceiver to the largest extent possible. Accordingly, we judge
the goodness of a protocol by the following two yardsticks: 

- the overall number of time slots required by the protocol to terminate;
- for each individual station, the total number of time slots when it has to
  transmit/receive a packet.

The initialization problem is to assign to each of the n stations in the ARN 
an integer ID number in the range [1..n] such that no two stations are assigned
the same ID. The initialization problem is fundamental in both network design 
and in multiprocessor systems m- 
Recent advances in wireless communications and mobile computing have
exacerbated the need for efficient protocols for ARNs. Indeed, a large number of such
protocols have been reported in the literature [1,2,5,7,8]. However, many of these
protocols function under the assumption that the n stations in the ARN have
been initialized in advance. The highly non-trivial task of assigning the stations
distinct ID numbers, i.e. initializing the stations, is often ignored in the
literature. It is, therefore, of importance to design efficient initialization protocols for
ARNs both in the case where the system has a collision detection capability and
for the case where this capability is not present.

Recently, Hayashi et al. [3] have reported an initialization protocol for an
n-station, k-channel ARN that terminates with high probability in O(n/k) time
slots, provided that k < However, the protocol of [3] was not designed to 

be energy-efficient and the stations have to be awake for the entire duration of 
the protocol. 

Our first main contribution is to propose energy-efficient randomized
initialization protocols for ARNs. First, we show that if the number n of stations is
known beforehand, the single-channel ARN can be initialized by a protocol
terminating, with high probability, in O(n) time slots, with no station being awake
for more than O(log n) time slots.

Next, we consider the case where the number n of stations is not known
beforehand. The key insight that leads to an energy-efficient solution in this scenario
is an elegant strategy for finding a good approximation for n. Specifically, we
propose an energy-efficient approximation protocol for n that terminates, with
high probability, in O((log n)^2) time slots with no station being awake for more
than O(log n) time slots. Using our approximation protocol, we show that even
if n is not known the single-channel ARN can be initialized by a protocol
terminating, with high probability, in O(n) time slots, with no station being awake
for more than O(log n) time slots.

Finally, we extend this protocol to the k-channel ARN. Specifically, we
propose an energy-efficient initialization protocol for an n-station, k-channel ARN
that terminates, with high probability, in O(n/k + log n) time slots with no station
being awake for more than O(log n) time slots.

2 Basics 

The main goal of this section is to review a number of fundamental results in 
basic probability theory and to discuss simple prefix sums protocols that will be 
needed in the remainder of the paper. 



2.1 A Refresher of Basic Probability Theory 

Throughout, Pr[A] will denote the probability of event A. For a random variable
X, E[X] denotes the expected value of X. Let X be a random variable denoting
the number of successes in n independent Bernoulli trials with parameters p and
1 − p. It is well known that X has a binomial distribution and that for every r,
(0 ≤ r ≤ n),

\[ \Pr[X = r] = \binom{n}{r} p^{r} (1 - p)^{n - r}. \tag{1} \]

Further, the expected value of X is given by

\[ E[X] = \sum_{r=0}^{n} r \cdot \Pr[X = r] = np. \tag{2} \]

To analyze the tail of the binomial distribution, we shall make use of the following 
estimate, commonly referred to as Chernoff bound [^: 

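\[ \Pr[X > (1 + \delta) E[X]] < \left( \frac{e^{\delta}}{(1 + \delta)^{1 + \delta}} \right)^{E[X]} \qquad (0 < \delta). \tag{3} \]

We also use the following estimates easily derived from (3):

\[ \Pr[X < (1 - \epsilon) E[X]] < e^{-\frac{\epsilon^{2}}{2} E[X]} \qquad (0 < \epsilon < 1). \tag{4} \]

\[ \Pr[X > (1 + \epsilon) E[X]] < e^{-\frac{\epsilon^{2}}{3} E[X]} \qquad (0 < \epsilon < 1). \tag{5} \]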
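To get a feeling for how the Chernoff bound (3) is used later with δ = 3, the following sketch compares it with the exact binomial tail. It is a numerical illustration only; the parameters (n = 256 stations and expected value log n = 8, with logarithms to base 2) and the function names are assumptions chosen for this example.

```python
import math


def binomial_upper_tail(n, p, threshold):
    """Exact Pr[X > threshold] for X ~ Binomial(n, p), summed term by term."""
    return sum(
        math.comb(n, r) * p**r * (1 - p) ** (n - r)
        for r in range(n + 1)
        if r > threshold
    )


def chernoff_upper(mean, delta):
    """Right-hand side of bound (3): (e^delta / (1+delta)^(1+delta))^mean."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mean


n = 256
mean = math.log2(n)          # 8 expected successes
exact = binomial_upper_tail(n, mean / n, 4 * mean)
bound = chernoff_upper(mean, 3)
```

With δ = 3 the base of the bound is e^3/4^4, whose base-2 logarithm is about −3.67, so the bound is slightly below n^{−3.67}; the exact tail is smaller still.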



2.2 Prefix Sums Protocols 

Suppose that the ARN has n stations each of which has a unique ID in the range
[1..m] (m ≥ n). Let S_i denote a station with ID i (1 ≤ i ≤ m). Note that for
some i, S_i may not exist. Suppose that each station S_i has a real number x_i.
The prefix sum of x_i is the sum of the real numbers with indices no more than
i, that is,

\[ \sum \{ x_j \mid 1 \le j \le i \text{ and } S_j \text{ exists} \}. \]

The prefix sums problem asks to compute all prefix sums. A naive protocol
can solve the prefix sums problem in m − 1 time slots in the one-channel ARN
as follows: In each time slot i (1 ≤ i ≤ m − 1), station S_i transmits x_i on the
channel, and every station S_j (i < j ≤ m) monitors the channel. By summing the
real numbers received, each station learns its prefix sum. However, this protocol
is not energy efficient, because the last station S_m monitors the channel in all of
the m − 1 time slots.

When the protocol terminates the following three conditions are satisfied: 

(ps1) Every active station S_i, (1 ≤ i ≤ m), stores its prefix sum.

(ps2) The last station, that is, station S_k such that no station S_i with i > k
exists, has been identified.

(ps3) The protocol takes 2m − 2 time slots and no station is awake for more
than 2 log m time slots.
If m = 1, then S_1 knows x_1 and the above conditions are verified. Now,
assume that m ≥ 2. Partition the n stations into two groups V_1 = {S_i | 1 ≤ i ≤
m/2} and V_2 = {S_i | m/2 + 1 ≤ i ≤ m}. Recursively compute the prefix sums in V_1
and V_2. By the induction hypothesis, conditions (ps1)-(ps3) above are satisfied
and, therefore, each of the two subproblems can be solved in m − 2 time slots,
with no station being awake for more than 2 log m − 2 time slots. Let S_j and
S_k be the last active stations in V_1 and in V_2, respectively. In the next
time slot, station S_j transmits the sum Σ{x_i | 1 ≤ i ≤ j and S_i exists} on the
channel. Every station in V_2 monitors the channel and updates the value of its
prefix sum. In one additional time slot station S_k reveals its identity. The reader
should have no difficulty confirming that the protocol satisfies (ps1)-(ps3) above.
Thus, we have:

Lemma 1. The prefix sums problem on the single-channel ARN of n stations
with unique ID [1..m] (m ≥ n) can be solved in 2m − 2 time slots with no station
being awake for more than 2 log m time slots.
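The recursive protocol behind Lemma 1 can be simulated to watch both cost measures at once. The sketch below is an interpretation, not the authors' code: it assumes m is a power of two, models only which stations are awake in each of the two combining slots (in the second slot the last station of V_1 is assumed to listen while the last station of V_2 announces itself), and uses the illustrative name `prefix_sums`.

```python
def prefix_sums(values, m):
    """Simulate the recursive prefix sums protocol on IDs 1..m.

    values: dict mapping the ID of each existing station to its number x_i.
    Returns (prefix sums per ID, time slots used, max awake slots, last ID).
    """
    prefix = dict(values)
    awake = {i: 0 for i in values}

    def solve(lo, hi):
        if lo == hi:
            return 0, (lo if lo in values else None)
        mid = (lo + hi) // 2
        t1, last1 = solve(lo, mid)        # V_1
        t2, last2 = solve(mid + 1, hi)    # V_2
        right = [i for i in values if mid < i <= hi]
        # Slot 1: the last station of V_1 broadcasts the total of V_1,
        # every station of V_2 monitors the channel.
        for i in right:
            awake[i] += 1
        if last1 is not None:
            awake[last1] += 1
            total_left = prefix[last1]
            for i in right:
                prefix[i] += total_left
        # Slot 2: the last station of V_2 reveals its identity.
        if last2 is not None:
            awake[last2] += 1
        if last1 is not None:
            awake[last1] += 1
        return t1 + t2 + 2, (last2 if last2 is not None else last1)

    slots, last = solve(1, m)
    return prefix, slots, max(awake.values(), default=0), last
```

For m = 8 the simulation uses 2m − 2 = 14 slots and no station is awake for more than 2 log m = 6 of them, in agreement with (ps3).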

Next we extend the single-channel prefix sums protocol to a k-channel ARN.
We begin by partitioning the stations into k equal-sized subsets X_1, X_2, ..., X_k
such that X_i = {S_j | (i − 1)·m/k + 1 ≤ j ≤ i·m/k and S_j exists}. By assigning one
channel to each subset, the prefix sums within each X_i can be computed
using the single-channel prefix sums protocol discussed above. This needs 2m/k − 2
time slots with no station being awake for more than 2 log(m/k) time slots. At this
point, we have the local sum sum(X_i) of each X_i and we need to compute the
prefix sums of sum(X_1), sum(X_2), ..., sum(X_k). This can be done by modifying
slightly the single-channel prefix-sums protocol. Recall that in the single-channel
protocol, the prefix sums of V_1 are computed recursively, and then those of V_2 are
computed recursively. Since k channels are available, the prefix sums of V_1 and
V_2 are computed simultaneously using k/2 channels each. After that, the overall
solution can be obtained in two more time slots. Using this idea, the prefix sums
can be computed in 2m/k + 2 log k − 2 time slots with no station being awake for
more than 2 log(m/k) + 2 log k = 2 log m time slots. To summarize, we have proved
the following result.

Lemma 2. The prefix sums problem on the k-channel ARN with n stations each
having a unique ID in the range [1..m] (m ≥ n) can be solved in 2m/k + 2 log k − 2
time slots with no station being awake for more than 2 log m time slots.

3 An Energy-Efficient Initialization for Known n

Suppose that each station knows the number n of stations. The purpose of this
section is to show that with probability exceeding 1 − O(n^{−2.67}) the n-station
ARN can be initialized in O(n) time slots with no station being awake for more
than O(log n) time slots.

The protocol uses the initialization protocol discussed in [3], which was not
designed with energy efficiency in mind.
Lemma 3. Even if n is not known beforehand, an n-station, single-channel
ARN can be initialized in at most 16n time slots with probability at least
1 − 2^{−1.62n}.

In paper [3], the constant factor of the time slots is not evaluated explicitly.

The details of our energy efficient initialization protocol are spelled out as 
follows. 



Protocol Initialization

Step 1 Each station selects uniformly at random an integer i in the range
[1..n/log n]. Let h(i) denote the set of stations that have selected integer i;

Step 2 Initialize each set h(i) individually in 64 log n time slots;

Step 3 Let N_i denote the number of stations in h(i). By computing the prefix
sums of N_1, N_2, ..., N_{n/log n} every station determines, in the obvious way, its
global ID within the ARN.



Clearly, Step 1 needs no transmission. 

Step 2 can be performed in 64n time slots using the protocol for Lemma 3
as follows: the stations in group h(i), (1 ≤ i ≤ n/log n), are awake for 64 log n time
slots from time slot (i − 1) · 64 log n + 1 to time slot i · 64 log n. Outside of this
range of slots all the stations in group h(i) are asleep and consume no power.

As we are going to show, with high probability, every group h(i) contains
at most 4 log n stations. To see that this is the case, observe that the expected
number of stations in h(i) is E[|h(i)|] = n × (log n)/n = log n. Now, using the Chernoff
bound in (3) we can write

\begin{align*}
\Pr[|h(i)| > 4 \log n] &= \Pr[|h(i)| > (1 + 3) E[|h(i)|]] \\
&< \left( \frac{e^{3}}{4^{4}} \right)^{\log n} && \text{(by (3) with } \delta = 3\text{)} \\
&< n^{-3.67} && \text{(since } \log \tfrac{e^{3}}{4^{4}} = -3.67\ldots\text{)}
\end{align*}

It follows that the probability that group h(i) contains more than 4 log n stations
is bounded above by n^{−3.67}. Hence, with probability exceeding 1 − n · n^{−3.67} =
1 − n^{−2.67}, none of the groups h(1), h(2), ..., h(n/log n) contains more than 4 log n
stations. By Lemma 3, with probability exceeding 1 − 2^{−1.62×4 log n} = 1 − n^{−6.48},
the stations in group h(i) can be initialized in 16 × 4 log n = 64 log n time slots.
Thus, with probability exceeding 1 − O(n^{−2.67}), all the
groups h(i) will be initialized individually in 64 log n × n/log n = 64n time slots,
with no station being awake for more than 64 log n time slots.

Let P_i, (1 ≤ i ≤ n/log n), denote the last station in group h(i). At the end
of Step 2, each station P_i knows the number N_i of stations in h(i). Step 3 can
use the stations P_i to compute the prefix sums of N_1, N_2, ..., N_{n/log n}. The prefix
sums protocol discussed in Section 2 will terminate in 2n/log n − 2 < 2n time slots
with no station being awake for more than 2 log(n/log n) < 2 log n time slots. To
summarize, we have proved the following result.
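The overall structure of protocol Initialization can be sketched in a centralized simulation. This is a simplification under loud assumptions: Step 2 (the randomized protocol of [3] run inside each group) is replaced by a stand-in that hands out local IDs in arrival order, logarithms are base 2, and the name `initialize_known_n` is hypothetical. The sketch only shows how group choice and prefix sums of the group sizes combine into global IDs.

```python
import math
import random


def initialize_known_n(n, rng):
    """Assign IDs 1..n by grouping, local numbering, and prefix sums."""
    groups = max(1, round(n / math.log2(n)))
    # Step 1: every station picks one of n / log n groups at random.
    choice = [rng.randrange(groups) for _ in range(n)]
    # Step 2 (stand-in): stations of a group get local IDs 1, 2, ...
    sizes = [0] * groups
    local_id = []
    for g in choice:
        sizes[g] += 1
        local_id.append(sizes[g])
    # Step 3: prefix sums of the group sizes give each group its offset.
    offsets = []
    running = 0
    for size in sizes:
        offsets.append(running)
        running += size
    ids = [offsets[g] + r for g, r in zip(choice, local_id)]
    return ids, max(sizes)


ids, largest_group = initialize_known_n(1024, random.Random(1))
```

The returned IDs are always a permutation of 1..n; the size of the largest group is the quantity that the Chernoff argument above bounds by 4 log n with high probability.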
Theorem 4. If the number n of stations is known beforehand, with probability
exceeding 1 − O(n^{−2.67}), an n-station, single-channel ARN can be initialized in
O(n) time slots, with no station being awake for more than O(log n) time slots.



4 Finding a Good Approximation of n 

At the heart of our energy-efficient initialization protocol of an ARN where
the number n of stations is not known beforehand lies a simple and elegant
approximation protocol for n. Specifically, this protocol returns an integer I
satisfying, with probability at least 1 − O(n^{−1.83}), the double inequality:

\[ \log n - \log\log n - 4 \le I \le \log n - \log\log n + 1. \tag{6} \]

Once a good approximation of n is available, we can use the protocol discussed
in Section 3 to initialize the entire ARN.

The details of our approximation protocol are spelled out as follows:



Protocol Approximation

Each station generates uniformly at random a real number x in (0, 1]
Let g(i), (i ≥ 1), denote the group of stations for which 1/2^i < x ≤ 1/2^{i−1}
for i ← 1 to ∞ do
    protocol Initialization is run on group g(i) for 128i time slots;
    if the initialization is complete and if g(i) contains at most 8i stations
    then the first station in g(i) transmits an “exit” signal.
end for



Let us first evaluate the number of time slots it takes protocol Approximation
to terminate. Let I be the value of i when the for-loop is exited. One iteration
of the for-loop takes 128i + 1 time slots. Thus, the total number of time slots is
at most

\[ \sum_{i=1}^{I} (128 i + 1) < 64 (I + 1)^{2}. \]

Next, we evaluate for each station the maximum number of time slots during
which it has to be awake. Clearly, each station belongs to exactly one of the
groups g(1), g(2), ..., and therefore, every station is awake for 128i ≤ 128I time
slots. Further, all the stations must monitor the channel to check for the “exit”
signal. Of course, this takes an additional I time slots. Thus, no station needs
to be awake for more than 129I time slots.
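The behaviour of protocol Approximation can be imitated with a short simulation. The sketch rests on a simplifying assumption that is not part of the paper: the run of Initialization on g(i) is not simulated, and the loop simply exits at the first i with |g(i)| ≤ 8i. Logarithms are base 2 and the function name `approximate` is illustrative.

```python
import math
import random


def approximate(n, rng):
    """Return (I, total time slots) for a simplified protocol Approximation."""
    sizes = {}
    for _ in range(n):
        x = 1.0 - rng.random()                # uniform in (0, 1]
        i = math.floor(-math.log2(x)) + 1     # 1/2^i < x <= 1/2^(i-1)
        sizes[i] = sizes.get(i, 0) + 1
    i = 1
    slots = 0
    while True:
        slots += 128 * i + 1                  # 128i slots plus the exit slot
        if sizes.get(i, 0) <= 8 * i:          # assumed exit condition
            return i, slots
        i += 1


n = 2**16
I, slots = approximate(n, random.Random(7))
low = math.log2(n) - math.log2(math.log2(n)) - 4
high = math.log2(n) - math.log2(math.log2(n)) + 1
```

For n = 2^16 the double inequality (6) reads 8 ≤ I ≤ 13, and the slot count equals 64·I·(I + 1) + I, which stays below 64(I + 1)^2.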

Our next task is to show that, with high probability, I satisfies (6). For this
purpose we rely on the following technical results.

Lemma 5. If i satisfies 1 ≤ i ≤ log n − log log n − 4 then Pr[|g(i)| > 8i] >
1 − n^{−2.88}.
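Proof. Clearly, i ≤ log n − log log n − 4 implies that

\[ 2^{i} \le \frac{n}{16 \log n} \quad\text{and similarly}\quad 16 i \le 16 \log n \le \frac{n}{2^{i}}. \]

Since g(i) consists of stations generating a real number x satisfying 1/2^i < x ≤
1/2^{i−1}, the expected number of stations is E[|g(i)|] = n/2^i. Using the Chernoff
bound in (4) with ε = 1/2, we can evaluate the probability Pr[|g(i)| < 8i] that
group g(i) contains strictly less than 8i stations as follows:

\begin{align*}
\Pr[|g(i)| < 8i] &\le \Pr\!\left[|g(i)| < \left(1 - \tfrac{1}{2}\right) \frac{n}{2^{i}}\right] && \text{(since } 16 i \le \tfrac{n}{2^{i}}\text{)} \\
&< e^{-\frac{1}{8} \cdot \frac{n}{2^{i}}} && \text{(by (4) with } \epsilon = \tfrac{1}{2}\text{)} \\
&\le e^{-2 \log n} && \text{(since } 16 \log n \le \tfrac{n}{2^{i}}\text{)} \\
&< n^{-2.88} && \text{(since } \log(e^{-2}) = -2.88\ldots\text{)}
\end{align*}

as claimed.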

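Lemma 6. If i = ⌊log n − log log n⌋ + 1 then Pr[|g(i)| ≤ 8i] > 1 − n^{−1.83}.

Proof. If i = ⌊log n − log log n⌋ + 1 then clearly,

\[ \log n - \log\log n < i \le \log n - \log\log n + 1. \]

This also implies that

\[ \frac{n}{\log n} < 2^{i} \le \frac{2n}{\log n} \quad\text{and that}\quad \frac{i}{2} < \frac{\log n}{2} \le \frac{n}{2^{i}} < \log n < i + \log\log n \le 2i. \]

Using these relations, we can evaluate the probability Pr[|g(i)| > 8i] that group
g(i) contains strictly more than 8i stations as follows:

\begin{align*}
\Pr[|g(i)| > 8i] &\le \Pr\!\left[|g(i)| > 4 \frac{n}{2^{i}}\right] && \text{(since } \tfrac{n}{2^{i}} < 2i\text{)} \\
&< \left( \frac{e^{3}}{4^{4}} \right)^{\frac{n}{2^{i}}} && \text{(from (3) with } \delta = 3\text{)} \\
&\le \left( \frac{e^{3}}{4^{4}} \right)^{\frac{\log n}{2}} && \text{(from } \tfrac{n}{2^{i}} \ge \tfrac{\log n}{2}\text{)} \\
&< n^{-1.83} && \text{(from } \log \left(\tfrac{e^{3}}{4^{4}}\right)^{\frac{1}{2}} = -1.83\ldots\text{)}
\end{align*}

This completes the proof.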

Lemmas 5 and 6 combined yield the following result.

Lemma 7. The value I of i when the for-loop is exited satisfies, with probability
at least 1 − O(n^{−1.83}), condition (6).

Thus, we have proved the following important result.

Theorem 8. Protocol Approximation returns an integer I satisfying, with
probability at least 1 − O(n^{−1.83}), condition (6). Moreover, the protocol terminates
in 64(log n)^2 time slots with no station being awake for more than 129 log n time
slots.
5 An Energy-Efficient Initialization for Unknown n



The main goal of this section is to present an energy-efficient initialization pro- 
tocol in the case where the number of stations in the ARN is not known. 

Recall that protocol Initialization partitions the n stations into n/log n
groups and initializes each group independently. Since n is unknown, this
partitioning cannot be done. Instead, using protocol Approximation, we find an
integer I satisfying condition (6) with probability at least 1 − O(n^{−1.83}). After
that, execute Initialization with 2^{I+4} groups h(1), h(2), ..., h(2^{I+4}). Suppose
that condition (6) is satisfied. Then, we can write

\[ \frac{\log n}{32} \le \frac{n}{2^{I+4}} \le \log n. \tag{7} \]
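For completeness, inequality (7) follows from condition (6) by adding 4 and exponentiating. The short derivation below is a worked restatement using only the bounds of (6), with logarithms to base 2:

```latex
\begin{align*}
\log n - \log\log n - 4 \le I \le \log n - \log\log n + 1
  &\;\Longrightarrow\; \log n - \log\log n \le I + 4 \le \log n - \log\log n + 5 \\
  &\;\Longrightarrow\; \frac{n}{\log n} \le 2^{I+4} \le \frac{32\,n}{\log n} \\
  &\;\Longrightarrow\; \frac{\log n}{32} \le \frac{n}{2^{I+4}} \le \log n .
\end{align*}
```

In words, using 2^{I+4} groups keeps the expected group size between (log n)/32 and log n, so each group is no larger in expectation than in the known-n protocol.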



In this case, by Theorem 4, with probability 1 − O(n^{−3.67}), 64 log n time slots
are sufficient to initialize a particular h(i). However, since n is unknown, we
cannot allot 64 log n time slots to each group. Instead, we assign 128(I + 4)
(≥ 64 log n) time slots. As a result, with probability at least 1 − O(2^{I+4} · n^{−3.67}) >
1 − O(n^{−2.67}), all of the groups can be initialized locally in 2^{I+4} · 128(I + 4) = O(n)
time slots with no station being awake for more than 128(I + 4) = O(log n) time
slots. After that, the prefix sums are computed in 2 · 2^{I+4} − 2 = O(n/log n) time
slots with no station being awake for more than 2 log(2^{I+4}) = O(log n) time
slots. Since condition (6) is satisfied with probability at least 1 − O(n^{−1.83}), we
have



Theorem 9. Even if the number n of stations is not known beforehand, with 
probability exceeding 1 — an n-station, single-channel ARN can be 

initialized in O(n) time slots, with no station being awake for more than O(log n) 
time slots. 
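
As a rough illustration of the accounting in this section, the following sketch tabulates the slot counts for a given value of I. The function name and the dictionary keys are hypothetical; the formulas are the ones used in the text (2^{I+4} groups, 128(I + 4) slots per group, 2 · 2^{I+4} − 2 slots for the prefix sums). It only does arithmetic and does not simulate the protocol.

```python
def slot_budget(I):
    """Slot counts when Initialization is run with 2**(I+4) groups (sketch)."""
    groups = 2 ** (I + 4)            # number of groups h(1), ..., h(2^(I+4))
    per_group = 128 * (I + 4)        # slots assigned to each group
    return {
        "groups": groups,
        "slots_per_group": per_group,
        "local_init_slots": groups * per_group,   # groups handled one after another
        "prefix_sum_slots": 2 * groups - 2,
        "prefix_sum_awake": 2 * (I + 4),          # 2 * log2(2^(I+4))
    }


budget = slot_budget(0)
```

For I = 0 this gives 16 groups of 512 slots each, so the per-station awake time stays proportional to I + 4 while the total grows with the number of groups.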



6 An Energy-Efficient Initialization for the k-Channel 
ARN and Unknown n 

The main purpose of this section is to present an energy-efficient initialization 
protocol for the n-station k-channel ARN, when n is not known beforehand. Let 
C(1), C(2), . . . , C(k) denote the k channels available in the ARN. 

The idea is to parallelize the protocol in Section 5 to take advantage of 
the k channels. Recall that this initialization protocol first executes protocol 
Approximation and then executes protocol Initialization. 

We first extend protocol Approximation for the case where k channels are 
available. Having determined groups g(1), g(2), . . . , g(I), we allocate groups to 
channels. For every i (1 ≤ i ≤ k), we allocate group g(i) to channel C(i) and 
attempt to initialize group g(i) in 128i time slots. If none of these attempts is 
successful, we allocate the next set of k groups g(k + 1), g(k + 2), . . . , g(2k) to 
the k channels in a similar fashion. This is continued until 
eventually one of the groups is successfully initialized. We now estimate the 
number of time slots required to get the desired value of I. Let c be the integer 
satisfying ck + 1 ≤ I ≤ (c + 1)k. In other words, I is found in the c-th iteration 
(counting from 0). If c ≥ 1, then the total number of time slots is 

\[ O\!\left(\frac{(\log n)^2}{k}\right). \]

If c = 0, then the total number of time slots is 128(I + 4) = O(log n). Therefore, 
the k-channel version of Approximation terminates, with high probability, 
in O((log n)^2 / k + log n) time slots, with no station being awake for more than 
O(log n) time slots. 

Similarly, we can extend protocol Initialization for the k-channel case as 
follows. Recall that we need to initialize each of the groups h(1), h(2), . . . , h(2^{I+4}). 
Since we have k channels, 2^{I+4}/k groups can 
be assigned to each channel and be initialized efficiently. Since each group h(j) 
can be initialized in 128(I + 4) time slots, all of the 2^{I+4} groups can be initialized 
in 128(I + 4) · 2^{I+4}/k = O(n/k) time slots, with no station being awake for more than 
O(log n) time slots. After that we use the k-channel version of the prefix sums 
protocol discussed in Section |2I This takes 2 ^ — h 2 log 7 — 2 < -I- 2 log 7 

time slots, with no station being awake for more than 2 log ^ -1-2 log 7 time slots. 
Therefore, we have the following result: 

Theorem 10. Even if the number n of stations is unknown, with probability 
exceeding 1 − O(n“^ ®^), an n-station, k-channel ARN can be initialized in O(n/k + 
log n) time slots, with no station being awake for more than O(log n) time slots. 



\[ 128k + 128(2k) + 128(3k) + \cdots + 128(ck) = O(c^2 k) = O\!\left(\frac{I^2}{k}\right) = O\!\left(\frac{(\log n)^2}{k}\right) \]
Abstract. The problem of constructing the suffix tree of a common 
suffix tree (CS-tree) is a generalization of the problem of constructing 
the suffix tree of a string. It has many applications, such as in minimizing 
the size of sequential transducers and in tree pattern matching. The 
best-known algorithm for this problem is Breslauer's O(n log |Σ|) time 
algorithm, where n is the size of the CS-tree and |Σ| is the alphabet size, 
which requires O(n log n) time if |Σ| is large. We improve this bound by 
giving an O(n log log n) algorithm for integer alphabets. For trees called 
shallow k-ary trees, we give an optimal linear time algorithm. We also 
describe a new data structure, the Bsuffix tree, which enables efficient 
queries for patterns of completely balanced k-ary trees from a k-ary tree 
or forest. We also propose an optimal O(n) algorithm for constructing 
the Bsuffix tree for integer alphabets. 



1 Introduction 

The suffix tree of a string S ∈ Σ^n is the compacted trie of all the suffixes of 
S$ ($ ∉ Σ). This is a very fundamental and useful structure in combinatorial 
pattern matching. Weiner [13] introduced this structure and showed that it can 
be computed in O(n|Σ|) time, where |Σ| is the alphabet size. Since then, much 
work has been done on simplifying algorithms and improving bounds mm , 
with algorithms achieving an O(n log |Σ|) computing time (see also [8] for 
details). Recently, Farach [5] proposed a new algorithm that achieves linear time 
(independent of the alphabet size) for integer alphabets. 

A common suffix tree, or a CS-tree for short, is a data structure that represents 
a set of strings. Constructing its suffix tree is also an important problem that appears in tasks 
such as minimizing sequential transducers of deterministic finite automata [2] 
and tree pattern matching m- Kosaraju m mentioned that the generalized 
suffix tree of all the suffixes of a set of strings represented by a CS-tree can be 
constructed in O(n log n) time, where n is the size of the CS-tree. Breslauer [2] 
improved this bound by giving an O(n log |Σ|) algorithm. Note that both of the 
algorithms were based on Weiner's suffix tree construction algorithm [13]. But 
this algorithm becomes O(n log n) when |Σ| is large. In this paper, we improve 
their bound by giving an O(n log log n) algorithm for integer alphabets. Shallow 
trees are trees whose depths are at most c log n, where n is the size 
of the tree and c is some constant. We give an optimal O(n) algorithm for trees 
called shallow k-ary trees, for constant k. 

We also deal with a new data structure called a Bsuffix tree, which is a 
generalization of the suffix tree of a string. Using the suffix tree of a CS-tree, 
we can find a given path in a tree very efficiently. The Bsuffix tree is a data 
structure that enables us to query any given completely balanced k-ary tree 
pattern from a k-ary tree or forest very efficiently. Note that the concept of a 
Bsuffix tree is very similar to that of an Lsuffix tree m. which enables us to 
query any square submatrix of a square matrix efficiently. We will show that this 
data structure can be built in O(n) time for integer alphabets. Bsuffix trees have 
many useful features in common with ordinary suffix trees. For example, using 
this data structure, we can find a pattern (a completely balanced k-ary tree) in a 
text k-ary tree in O(m log m) time, where m is the size of the pattern. Moreover, 
we can enumerate common completely balanced k-ary subtrees in linear time. 
Considering that general tree pattern matching requires a O(nlog^n) time [1], 
these results mean that a Bsuffix tree is a very useful data structure. 

2 Preliminaries 

2.1 The Suffix Tree 

The suffix tree of a string S ∈ Σ^n is the compacted trie of all the suffixes of S$ 
($ ∉ Σ). The tree has n + 1 leaves and each internal node has more than one 
child. Each edge is labeled with a non-empty substring of S$ and no two edges 
out of a node can have labels which start with the same character. Each node is 
labeled with the concatenated string of edge labels on the path from the root to 
the node, and each leaf has a label that is a different suffix of S$. Because each 
edge label is represented by the first and the last indices of the corresponding 
substring in S$, the data structure can be stored in O(n) space. In this paper, 
we deal only with suffix trees in which the edges going out from a node are 
sorted according to their labels. Notice that this property is very convenient for 
querying substrings. 

For this powerful and useful data structure, we have the following theorems: 

Theorem 1 (Farach [5]). The suffix tree of a string S ∈ {1, . . . , n}^n can be 
constructed in O(n) time. 

Note that alphabet {1, . . . ,n} is called an integer alphabet. In this paper, we 
will deal with only integer alphabets. Farach’s suffix tree construction algorithm 
and our algorithms to be presented use the following theorem: 

Theorem 2 (Harel and Tarjan [9]). For any tree with n nodes, we can find 
the lowest common ancestor of any two nodes in constant time after O(n) 
preprocessing if the following values can be obtained in constant time: bitwise 
AND, OR, and XOR of two binary numbers, and the positions of the leftmost 
and rightmost 1-bit in a binary number. 

This theorem indicates that the longest common prefix (LCP) of any two 
suffixes can be obtained from the suffix tree in constant time after linear-time 
preprocessing. 
Fig. 1. CS-tree of the strings 1413$, 5413$, 913$, 56213$, 3213$, 5213$, and 83$ 



2.2 The Suffix Tree of a Tree 

A set of strings {S_1, . . . , S_k}, such that no string is a suffix of another, can be 
represented by a common suffix tree (CS-tree for short), which is defined as 
follows: 

Definition 1 (CS-tree). In the CS-tree of a set of strings {S_1, . . . , S_k}, each 
edge is labeled with a single character, and each node is labeled with the 
concatenated string of edge labels on the path from the node to the root. In the tree, 
no two edges out of a node can have the same label. Furthermore, the tree has k 
leaves, each of which has a different label that is one of the strings S_i. 

Figure 1 shows an example of a CS-tree. The number of nodes in the CS-tree 
is equal to the number of different suffixes of the strings. Thus, the size of a 
CS-tree is not larger than the sum of the lengths of the strings represented by the 
CS-tree. Note that the CS-tree can be constructed easily from the strings in time 
linear in the sum of the lengths of the strings. 
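
A minimal sketch of this construction, assuming the CS-tree is stored as nested dictionaries (a representation chosen here for brevity, not one prescribed by the paper): each string is inserted from its last character backwards, so that shared suffixes share a path from the root. The names `build_cs_tree` and `count_leaves` are hypothetical.

```python
def build_cs_tree(strings):
    """Insert every string back to front into a trie keyed by characters.

    Returns (root, node_count); node_count includes the root, which
    corresponds to the empty suffix.
    """
    root = {}
    nodes = 1
    for s in strings:
        node = root
        for ch in reversed(s):
            if ch not in node:      # a suffix not seen before: create a node
                node[ch] = {}
                nodes += 1
            node = node[ch]
    return root, nodes


def count_leaves(node):
    """Number of leaves below (and including) a node."""
    if not node:
        return 1
    return sum(count_leaves(child) for child in node.values())


FIG1 = ["1413$", "5413$", "913$", "56213$", "3213$", "5213$", "83$"]
tree, n_nodes = build_cs_tree(FIG1)
```

For the seven strings of Fig. 1, whose lengths sum to 33, this yields 14 nodes (13 distinct non-empty suffixes plus the root) and 7 leaves, one per string.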

The generalized suffix tree of a set of strings {S_1, . . . , S_k} is the compacted 
trie of all the suffixes of all the strings in the set. As mentioned in m, the 
suffix tree of a CS-tree is the same as the generalized suffix tree of the strings 
represented by the CS-tree. Furthermore, the size of the generalized suffix tree 
is linear to that of the CS-tree, because the number of leaves of the suffix tree 
is equal to the number of edges in the CS-tree. Note that each edge label of the 
suffix tree of a CS-tree corresponds to a path in the CS-tree, and it can be 
represented by a pointer to the first edge (nearest to the leaves) and the path 
length. 

Let n_i be the length of S_i, and let N = ∑_i n_i. Let n be the number of nodes 
in the CS-tree of the strings. The generalized suffix tree can be obtained in O(N) 
time in the case of integer alphabets (i.e., S_i ∈ {1, . . . , n}^{n_i}) as follows. First, 
we construct the suffix tree of the concatenated string S_1$S_2$ · · · $S_k using 
Farach's suffix tree construction algorithm. Then, we obtain the generalized suffix 
tree by cutting away the unwanted edges and nodes. But N is sometimes much 
larger than the size n of the CS-tree: for example, there exists a tree for which 
N is Θ(n^2). This means that the O(N)-time suffix tree construction algorithm 
given above is not a linear time algorithm in terms of n. The best-known O(n log |Σ|) 
algorithm [2] for this problem is based on Weiner's suffix tree construction 
algorithm [13]. We will improve it by giving a new algorithm based on Farach's 
linear-time suffix tree construction algorithm. 



3 New Algorithm for Constructing the Suffix Tree of a 
CS-Tree 

3.1 Algorithm Outline 

Our approach to constructing the suffix tree of a CS-tree is based on Farach's 
suffix tree construction algorithm [5]. Farach's algorithm has three steps. First, 
it constructs a tree called an odd tree recursively. Next, it constructs another 
tree called an even tree by using the odd tree. Finally, it constructs the suffix 
tree by merging these two trees. Note that the odd tree is a trie of suffixes 
S[2i − 1] . . . S[n]$, and the even tree is a trie of suffixes S[2i] . . . S[n]$. This 
algorithm achieves an O(n) computation time for integer alphabets. 

We later also define the odd and even trees for the suffix tree of a CS-tree, 
and our algorithm also has three similar steps. First we build the odd 
tree or the even tree recursively, then we construct the even or odd tree by using 
the odd or even tree, respectively, and finally we merge them to construct the 
suffix tree. 

In the algorithm, we use the following theorems: 

Theorem 3. In any tree with n nodes, for any node v in the tree and any integer 
d > 0 that is smaller than the depth of v, we can find the ancestor of v whose 
depth is d in O(log log n) time after O(n) preprocessing, if the following values 
can be obtained in a constant time: OR, shift of any i bits, the number of 1-bits 
in a binary number, and any ith-leftmost 1-bit in a binary number. 

Theorem 4. In any shallow k-ary tree with n nodes where k is constant, 
for any node v in the tree and any integer d > 0 that is smaller than the depth 
of v, we can find the ancestor of v whose depth is d in a constant time after 
O(n) preprocessing, if any i-bit shift of a binary number can be performed in a 
constant time. 

Proofs of these theorems are given in the appendix. See Section 1 for the 
definition of a 'shallow tree'. Let T_lookup(n) be the time needed to compute a node's 
ancestor of depth d after O(n) preprocessing. 

Let us now define several notations. Let {S_1, . . . , S_k} be the strings 
represented by a given CS-tree. Let n_i be the length of S_i and let S_i = S_i[n_i] . . . S_i[1]. 
Note that the indices are arranged in reverse order. Theorems 3 and 4 
indicate that, for any i and j, we can access S_i[j] in T_lookup(n) time after O(n) 
preprocessing. Let S_i(m) be S_i's suffix of length m, i.e., S_i[m] . . . S_i[1]. Let lcp(S, S') 
and lcs(S, S') be the lengths of the longest common prefix and suffix of strings S 
and S', respectively. Let parent_U(v) be the parent node of v in the CS-tree U if 
v is not the root node t; otherwise, let it be t: i.e., parent_U(v_{i,j}) = v_{i,max(0,j−1)}. 
Let label(e) be the label given to edge e in the CS-tree. Let T_U be the suffix tree 
of the CS-tree U. 
3.2 Building a Half of the Suffix Tree Recursively 

All nodes in the CS-tree U = (V, E) have either odd or even label length. Let 
V_odd and V_even be the nodes with odd label lengths and those with even label 
lengths, respectively. If |V_odd| ≥ |V_even|, let V_small = V_even and V_large = V_odd; 
otherwise, let V_small = V_odd and V_large = V_even. We can obtain |V_odd| and |V_even| 
in O(n) time by an ordinary depth-first search on the CS-tree. Therefore, we 
can determine in linear time which node set is V_small. In this subsection, we 
recursively construct the compacted trie T_small of all the labels of nodes in 
V_small. Note that the technique for constructing T_small is very similar to that 
for constructing the odd tree in Farach's algorithm. 

Consider a new CS-tree U' = (V_small, E_small), where E_small = {(v, parent_{U'}(v) = 
parent_U(parent_U(v))) | v ∈ V_small, v ≠ t} and the edge labels are determined as 
follows. Radix sort the label pairs pair(v) = (label((v, parent_U(v))), 
label((parent_U(v), parent_U(parent_U(v))))) for all v ∈ V_small \ {t} and remove 
duplicates, where label(e) denotes the label of an edge e in the original CS-tree 
U (let label((t, t)) = φ ∉ {1, . . . , n}). Let rank(v) be the rank of pair(v) in the 
sorted list, which belongs to an integer alphabet [1, n/2] because the size of 
the new tree U' is not larger than half of that of the original CS-tree U. Let 
orig_pair(i) be a label pair pair(v) such that rank(v) = i. Let the label of an 
edge (v, parent_{U'}(v)) ∈ E_small be rank(v). Notice that all of these procedures 
can be performed in linear time. 
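
The pairing step can be sketched as follows. This is only an illustration under several assumptions that are not in the paper: the CS-tree is given as parent and label arrays with node 0 as the root t, nodes are numbered so that parent[v] < v, real labels are positive integers (so 0 can stand for φ), the root is left out of both parity classes, and Python's built-in sort is used in place of the radix sort. The name `contract_cs_tree` is hypothetical.

```python
def contract_cs_tree(parent, label):
    """Build the edges and labels of U' from a CS-tree U (sketch).

    parent[v]: parent of v, with parent[0] == 0 for the root t.
    label[v]:  integer label of the edge (v, parent[v]); label[0] is unused.
    """
    n = len(parent)
    depth = [0] * n
    for v in range(1, n):                      # relies on parent[v] < v
        depth[v] = depth[parent[v]] + 1
    odd = [v for v in range(1, n) if depth[v] % 2 == 1]
    even = [v for v in range(1, n) if depth[v] % 2 == 0]
    small = even if len(even) <= len(odd) else odd
    PHI = 0                                    # stands for label(t, t)

    def pair(v):
        p = parent[v]
        return (label[v], label[p] if p != 0 else PHI)

    pairs = sorted({pair(v) for v in small})   # the paper radix sorts here
    rank = {pr: i + 1 for i, pr in enumerate(pairs)}
    new_parent = {v: parent[parent[v]] for v in small}
    new_label = {v: rank[pair(v)] for v in small}
    return new_parent, new_label, pairs


# Path $ -> 3 -> {1, 8}, then 4 below 1 (a fragment of Fig. 1, with 9 for '$').
new_parent, new_label, pairs = contract_cs_tree(
    [0, 0, 1, 2, 2, 3], [0, 9, 3, 1, 8, 4])
```

In the usage example the even-depth nodes 2 and 5 form the smaller class; node 2 is attached to the root and node 5 to node 2, and `pairs[i - 1]` plays the role of orig_pair(i).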

We then construct the suffix tree T_{U'} of U' by using our algorithm recursively. 
After that, we construct T_small from T_{U'} as follows. We can consider a tree T' 
obtained from T_{U'} by changing its edge labels back to the original labels in U; for example, 
if the label of an edge in T_{U'} is ijk, the label of the corresponding edge in T' 
is orig_pair(i), orig_pair(j), orig_pair(k). Notice that this modification can be 
performed by making only a minor modification of the edge label representation 
and that it takes only linear time. 

We can construct T_small from T' very easily. T' contains all the labels of 
nodes in V_small, but it is not a compacted trie: the first characters of the labels of 
outgoing edges from the same node may be the same. But the second characters 
are different, and the edges are sorted lexicographically. Thus we can change T' 
to T_small by making only a minor adjustment: we merge such edges and make 
a node, and if the first characters of all the labels of edges are the same, we 
delete the original node. 

In this way, we can construct T_small in T(n/2) + O(n) time, where T(n) is 
the time our algorithm takes to build the suffix tree of a CS-tree of size n. 

3.3 Building the Other Half of the Tree 

In this subsection, we show how to construct the compacted trie T_large of all the 
labels of nodes in V_large from T_small in O(n T_lookup(n)) time. The technique is a slightly 
modified form of the second step of Farach's algorithm, which constructs the 
even tree from the odd tree. 

If we are given a lexicographic traversal of the leaves of the compacted trie 
(which is called lex-ordering in [5]), and the length of the longest common prefix 
of adjacent leaves, we can reconstruct the trie m- Note that it can be done in 
linear time in the case of constructing the suffix tree of a string. We will obtain 
these two pieces of information about T_large from T_small, and construct T_large in the same way. But 
this method can obtain only the label length from the leaf or root for each node 
of the compacted trie. Recall that each label is represented by the first node and 
the label length in our case. Thus we must obtain that node from its specified 
depth and one of its descendant leaves, which requires T_lookup(n) time. Hence the 
total time required by this procedure is O(n T_lookup(n)). 

Any leaf in T_large, except for those with labels of only one character, has a 
label consisting of a single character followed by the label of some corresponding 
leaf in T_small. We can obtain the lex-ordering of the labels of leaves in T_small 
by an in-order traversal of T_small, which takes only linear time. Thus we can 
obtain the lex-ordering of the labels of leaves (S_i(m)) in T_large by using the radix 
sorting technique, because we have S_i[m] and the lexicographically sorted list of 
S_i(m − 1). 

The longest common prefix length of adjacent leaves of T_large can also be 
obtained easily by using T_small. Let S_i(m) and S_j(n) be the labels of two 
adjacent leaves in T_large. If S_i[m] ≠ S_j[n], the longest common prefix length is 0. 
Otherwise, it is 1 + lcp(S_i(m − 1), S_j(n − 1)), which can be obtained in a constant 
time from T_small after linear-time preprocessing on T_small (see Theorem 2). In 
this way, we can construct T_large from T_small in O(n T_lookup(n)) time. According to 
Theorems 3 and 4, it is O(n log log n) for general CS-trees, and O(n) for shallow 
k-ary CS-trees (k: constant). 
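
The two ingredients of this step (ordering by first character plus the rank of the remainder, and the "0 or 1 + lcp of the remainders" rule) can be illustrated on a single string, which is the setting of Farach's original second step rather than the CS-tree setting of this paper. The sketch below assumes the string ends with a unique terminator, uses Python's sort instead of a radix sort, and takes the LCP oracle as a parameter; all function names are hypothetical.

```python
def order_large_from_small(s, small_sorted):
    """Order the suffixes s[p:] with p + 1 in small_sorted lexicographically.

    small_sorted lists start positions of the 'small' suffixes in lex order.
    """
    rank = {p: r for r, p in enumerate(small_sorted)}
    large = [p - 1 for p in small_sorted if p >= 1]
    # Two-key sort: first character, then rank of the remaining suffix.
    # A stable radix sort on these keys would make this step linear.
    large.sort(key=lambda p: (s[p], rank[p + 1]))
    return large


def adjacent_lcps(s, large_sorted, lcp_small):
    """LCP of neighbouring 'large' suffixes from an LCP oracle on 'small' ones."""
    return [0 if s[p] != s[q] else 1 + lcp_small(p + 1, q + 1)
            for p, q in zip(large_sorted, large_sorted[1:])]


def naive_lcp(a, b):
    """Stand-in for the constant-time LCP query of Theorem 2."""
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


s = "mississippi$"
small = sorted(range(1, len(s), 2), key=lambda p: s[p:])
large = order_large_from_small(s, small)
lcps = adjacent_lcps(s, large, lambda i, j: naive_lcp(s[i:], s[j:]))
```

In the CS-tree setting, the extra cost comes from locating the node that starts each label, which is where T_lookup(n) enters.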



3.4 Merging the Trees 

Now we have two compacted tries T_odd and T_even. In this subsection, we merge 
T_odd and T_even to construct the target suffix tree T_U. We call the compacted 
trie of odd/even-length suffixes of strings the generalized odd/even tree of the 
strings. The odd/even tree of a CS-tree is also the generalized odd/even tree of 
the strings represented by the CS-tree. Farach's algorithm merges the odd and 
even trees in time linear in the sum of the sizes of the odd and even trees. It can 
also be applied directly to our problem of merging generalized odd and even 
trees. Note that the merging can be done in linear time in Farach's algorithm, 
but requires O(n T_lookup(n)) time in our case. The outline of the algorithm is as 
follows. 

First, we merge the even and odd trees as follows, by assuming that one 
of two edge labels is a prefix of the other label if the first characters of the labels 
of the two edges are the same. Let e_1 = (v, v_1) and e_2 = (v, v_2) be edges 
that start from the same node v with the same first character. Let l_1 and l_2 
be the label lengths of e_1 and e_2, respectively. Without loss of generality, we 
let l_1 ≥ l_2. Then we construct an internal node v_1' between v and v_1 if l_1 > l_2; 
otherwise let v_1' be v_1. In the case that l_1 > l_2, let the label of edge (v, v_1') be the 
first l_2 characters of the label of the original edge (v, v_1) and let the label of edge 
(v_1', v_1) be the last l_1 − l_2 characters of the label of the original edge (v, v_1). Then we 
merge the two edges (v, v_1') and e_2. Note that this merging requires T_lookup(n) time 
because we must find the node which corresponds to the first character of the new 
edge (v_1', v_1). We merge recursively all over the two trees by the normal coupled 
depth-first search. Thus the total computing time required for the merging is 
O(n T_lookup(n)). 

Next, we unmerge the edges with different labels, because we may have merged 
edges too far. In this unmerging stage, we first compute the longest common 
prefix length of every merged pair of node labels in the suffix tree. Farach 
showed that all the required longest common prefix lengths for all the merged 
pairs can be obtained in O(n) time by using a data structure called d-links; for 
details of d-links and the algorithm, see [5]. Using the common prefix length 
of merged nodes, we can easily determine how far to unmerge edges. For each 
unmerged edge, we must find the node of the CS-tree that corresponds to the first 
character of its label, which requires T_lookup(n) time. Thus the total computing 
time for unmerging is also O(n T_lookup(n)). 

Hence the step of our algorithm for merging trees takes a total of 
O(n T_lookup(n)) time. Thus we obtain the recurrence T(n) = T(n/2) + O(n T_lookup(n)), 
where T(n) is the time needed to construct the suffix tree of a CS-tree of size 
n. Therefore, our algorithm achieves a T(n) = O(n log log n) computing time for 
general CS-trees, and an optimal T(n) = O(n) computing time for shallow k-ary 
CS-trees. 
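
To see why the recurrence gives the stated bounds, one can unroll it. The derivation below is a sketch that assumes, as an extra hypothesis not stated in the text, that T_lookup is nondecreasing; c denotes the constant hidden in the O-notation.

```latex
\begin{align*}
T(n) &\le T(n/2) + c\, n\, T_{lookup}(n) \\
     &\le \sum_{j \ge 0} c\, \frac{n}{2^{j}}\, T_{lookup}\!\left(\frac{n}{2^{j}}\right) \\
     &\le c\, T_{lookup}(n) \sum_{j \ge 0} \frac{n}{2^{j}}
      \;=\; 2c\, n\, T_{lookup}(n).
\end{align*}
```

With T_lookup(n) = O(log log n) from Theorem 3 this is O(n log log n), and with T_lookup(n) = O(1) from Theorem 4 it is O(n).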

4 The BSuffix Tree 

In this section, we propose a new data structure, the Bsuffix tree, which enables 
efficient queries of completely balanced binary trees from any binary forest 
(including a single tree). It can also be used for querying completely balanced 
k-ary subtrees from any k-ary forest (k need not be constant in this case), but 
we will deal with binary trees at first. The Bsuffix tree is a data structure for 
matching of nodes, but it can also be used for matching of edges (see Subsection 
4.3). 

4.1 Definition of the BSuffix Tree 

Consider a completely balanced binary tree P of height h. Let p_1, p_2, . . . , p_{2^h−1} 
be the nodes of P in breadth-first order, and let c_i ∈ {1, . . . , n} be the 
character given to node p_i. Note that p_{⌊i/2⌋} is the parent of p_i in this order. 
We call c_1 c_2 · · · c_{2^h−1} the label of P. We call the substring c_{2^i} · · · c_{2^{i+1}−1} of this 
label a Bcharacter. Furthermore, we call a string of Bcharacters a Bstring. For a 
Bstring b_1 b_2 · · · b_n, we call b_1 b_2 · · · b_i (i ≤ n) a Bprefix of the Bstring. Note 
that c_1 c_2 · · · c_{2^h−1} is a Bstring of length h. For two Bcharacters b_1 and b_2, we 
let b_1 > b_2 if b_1 is lexicographically larger than b_2 in the normal string 
representation. Note that Bcharacter b = c_{2^i} · · · c_{2^{i+1}−1} can be represented by node 
p_{2^i} ∈ P and integer i. 
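
As a small illustration of these definitions, the following sketch splits the breadth-first label of a completely balanced binary tree into its Bcharacters, one per level. The function name `bcharacters` and the use of tuples are choices made here, not part of the paper.

```python
def bcharacters(label):
    """Split a breadth-first label c_1 ... c_{2^h - 1} into h Bcharacters.

    The i-th Bcharacter (i = 0, 1, ...) is c_{2^i} ... c_{2^{i+1} - 1},
    i.e. the characters on level i of the tree.
    """
    n = len(label)
    if n == 0 or (n + 1) & n:          # n + 1 must be a power of two
        raise ValueError("expected 2**h - 1 characters")
    out, start = [], 1                 # start is the 1-based index 2^i
    while start <= n:
        out.append(tuple(label[start - 1:2 * start - 1]))
        start *= 2
    return out


bstring = bcharacters([5, 2, 2, 1, 3, 1, 3])
```

Because tuples compare lexicographically in Python, two Bcharacters of the same level can be compared directly, and a Bprefix of the Bstring is simply a prefix of the returned list.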

Consider a binary forest U of size n whose nodes are labeled with a character 
of an integer alphabet {1, . . . , n}. Let v_1, v_2, . . . , v_n be the concatenated list of 
the breadth-first-ordered node lists of all the binary trees in forest U, and let 
a_i ∈ {1, . . . , n} be the label of node v_i. Let L_i be the label of the largest 
completely balanced binary subtree of U whose root is node v_i. We call L_i 
followed by $_i ∉ {1, . . . , n} ($_i ≠ $_j) the label of node v_i. If the roots of two 
Fig. 2. An example of the Bsuffix tree: (1) binary tree U; (2) Bsuffix tree for U. 



completely balanced binary subtrees P_1 and P_2 of U are the same node and P_1 
includes P_2, the label of P_2 is a Bprefix of the label of P_1. The Bsuffix tree of U 
is the compacted trie T of the labels of all the nodes in U in the Bstring sense, 
i.e., the outgoing edges from any node in the Bsuffix tree have labels that start with 
different Bcharacters. Figure 2 shows an example of a Bsuffix tree. By using T, we can 
very easily query any completely balanced binary subtree of U. 

Edge labels of T can be represented by the first node in T and the depths of 
the first and the last nodes in the pattern. Therefore T can be stored in O(n) 
space. Note that we can access any member of the edge label of T in a constant 
time if we have both the breadth-first list and the depth-first list of the nodes 
of each tree in forest U. 



4.2 Construction of the BSuffix Tree 

In this subsection, we describe the O(n) algorithm for constructing the Bsuffix 
tree T of U. 

If forest U consists of only nodes with less than two children, it is obvious 
that we can construct the Bsuffix tree of U in O(n) time. Otherwise, we first 
construct a new binary forest U' as follows: For every node v_i with two children 
v_j, v_{j+1}, construct a node of U' (let it be w_i). If v_j and/or v_{j+1} have two children, 
let w_i be the parent of w_j and/or w_{j+1} in forest U'. Radix sort the label pairs 
(a_j, a_{j+1}) and remove duplicates. Let the label a'_i of w_i be the rank of the label 
pair (a_j, a_{j+1}) in the sorted list. Notice that the number of nodes in U' is not 
larger than n/2. We construct the Bsuffix tree T' of U' by using our algorithm 
recursively. Figure 3 shows an example of this recursive construction of new 
binary forests (trees in this case). Next, we construct T from T'. 
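
The construction of U' can be sketched as follows, assuming the forest is given by child arrays with None for a missing child (a representation chosen here for illustration) and using Python's sort instead of a radix sort. The name `contract_forest` is hypothetical.

```python
def contract_forest(left, right, a):
    """Build the nodes, labels and children of U' from a binary forest U.

    left[i], right[i]: indices of the children of node i, or None.
    a[i]: integer label of node i.
    Returns (new_label, new_children), both keyed by the index i of the
    node v_i of U from which w_i was created.
    """
    n = len(a)
    full = [i for i in range(n)
            if left[i] is not None and right[i] is not None]
    pairs = sorted({(a[left[i]], a[right[i]]) for i in full})
    rank = {p: r + 1 for r, p in enumerate(pairs)}
    new_label = {i: rank[(a[left[i]], a[right[i]])] for i in full}
    # w_j is a child of w_i only if the child v_j itself has two children.
    new_children = {
        i: tuple(c if c in new_label else None for c in (left[i], right[i]))
        for i in full
    }
    return new_label, new_children


# A complete binary tree with 7 nodes in breadth-first order.
left = [1, 3, 5, None, None, None, None]
right = [2, 4, 6, None, None, None, None]
labels, children = contract_forest(left, right, [1, 2, 2, 3, 4, 3, 4])
```

In this example three nodes survive (at most n/2, as noted above), and the two nodes whose children carry the same label pair (3, 4) receive the same new label.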

If we are given the lexicographically sorted list of all the node labels of U 
and the length (i.e., number of Bcharacters) of the longest common Bprefix of 
adjacent Bstring labels in this list, we can construct Bsuffix tree T in linear 
time. We obtain these two pieces of information from T'. 

Notice that the in-order traversal of the leaves of T' is also a lexicographically 
sorted list of all the first-character-deleted labels of nodes that have two children 
in U. Thus we can obtain the lexicographically sorted list of all the node labels 
of U by radix sorting the concatenated list of the in-order traversal of the leaves of 
T' and the labels of nodes with no or only one child. 

Fig. 3. Recursive construction of new binary trees in computing the Bsuffix tree 

The longest common Bprefix length l of adjacent labels can also be obtained 
from T'. If the first characters of two adjacent labels are different, l = 0. 
Otherwise, if one of the adjacent labels consists of only one character, the depth is 
l = 1. Otherwise, we compute the depth as follows. Let v_i and v_j be the adjacent 
nodes. Notice that we can obtain the longest common Bprefix length l' of the labels 
of w_i and w_j in U' in a constant time (see Theorem 2). Then it is clear that 
l = l' + 1. 

In this way we can construct T from T' in linear time. We obtain T(n) = 
T(n/2) + O(n), where T(n) denotes the time taken to compute the Bsuffix tree 
of a binary tree of size n. Therefore we conclude that our algorithm runs in O(n) 
time. 



4.3 Discussions on the BSuffix Tree 

Bsuffix trees are very similar to normal suffix trees. A Bsuffix tree enables an O(m log m) query 
for a completely balanced binary tree pattern of size m. It can also be used for 
finding (largest) common completely balanced binary subtrees of two binary trees 
in linear time. We can also enumerate frequent patterns of completely balanced 
binary trees in linear time by using this data structure. 

The data structure and our algorithm assume that the labels are given to 
nodes, but they can very easily be modified to deal with edge-matching problems 
as follows: Let the label of any node except for the root be the label of the 
incoming edge from its parent. Then T' in the above algorithm can be used as 
the compacted trie for edge matching. 

Bsuffix trees can also be used for querying completely balanced k-ary trees 
from any k-ary forest U. First, if a node has fewer than k children, remove the 
edges between it and its children. Otherwise, we reconstruct each node that has 
k children as a completely balanced binary tree of depth ⌈log_2 k⌉ and move each 
child to one of its leaves. For each inside node and leaf to which no node was mapped, 
give as its label a new character that is not in use. Notice that the size of the 
reconstructed forest is at most twice that of the original one. Then construct 
the Bsuffix tree for this reconstructed binary tree. It can obviously be used for 
querying completely balanced k-ary trees. 



5 Concluding Remarks 

We have described an O(n log log n) algorithm for constructing the suffix tree of 
a common suffix tree (CS-tree). For trees called shallow k-ary trees, we also 
described an O(n) algorithm. In addition, we proposed a new data structure called 
a Bsuffix tree, which enables efficient queries for completely balanced subtrees. 

The existence of a linear time algorithm for constructing the suffix tree of 
any tree for large alphabets remains an open question, as does the existence 
of more useful suffix trees that allow querying more general and flexible patterns 
than paths or completely balanced trees. 
Appendix: Proofs of Theorems 3 and 4 

Proof of Theorem 3 

We achieve an O(log log n) computing time by means of the following algorithm. 

Let m be the number of leaves in the target tree T. We call a path from some 
node to some leaf a 'run'. We first divide all nodes in the tree into m runs, as 
follows: 

1. Index the leaves by an in-order traversal of T, and let them be l_1, l_2, . . . , l_m. 

2. Consider a completely balanced binary tree B of height ⌈log_2(m + 1)⌉. Let 
v_1, v_2, . . . , v_{m'} be the nodes of B in breadth-first order, where m ≤ m' = 
2^{⌈log_2(m+1)⌉} − 1 < 2m. Map the leaves l_1, l_2, . . . , l_m to the in-order traversal 
of the nodes in B. Let l_{b_i} be the leaf of T mapped to v_i in B. 

3. For i = 1, 2, . . . , m', construct runs run_1, run_2, . . . , run_{m'} (note that m' − m 
of these are empty runs) as follows: 

- If l_{b_i} does not exist, let run_i = φ (an empty run) and continue. 

- Find the maximum run run_i that does not contain any node of runs 
{run_j | j < i} and ends at leaf l_{b_i}. Note that unless run_i starts from the 
root of T, the parent node of the first node (nearest to the root) of run_i 
must belong to some run run_{p_i}. We call run_{p_i} the parent run of run_i. 
Note that we can construct a tree R of runs by using this parent-child 
relationship. 

- Let r_1 = 1. If i > 1, compute r_i as follows: Let r be a binary number 
with only one 1-bit that is at the same position as the leftmost 1-bit of i, 
i.e., let r = 2^{⌊log_2 i⌋}. Then let r_i = r_{p_i} ∨ r, where ∨ denotes bitwise OR. 
Note that the depth of v_i in B is 1 + ⌊log_2 r_i⌋ = 1 + log_2 r = 1 + ⌊log_2 i⌋. 

Figure IHshows an example of T, B, and R. Note that ri is displayed in a binary 
number in Figure |4] (3). 

Fig. 4. Example of T, B, and R: (1) T, (2) B, (3) R.

Note that the parent of node $v_i$ in B is $v_{\lfloor i/2 \rfloor}$. Thus for any node of depth d in B and any integer d' such that 0 < d' < d, we can access the node's ancestor of depth d' in constant time by simply right-shifting its index by d − d' bits. For any run and some integer d, it is clear that the node of depth d in the run can be accessed in constant time if each run manages its nodes. Thus, once we find the run that contains the target ancestor, we can find the ancestor itself in constant time. We now discuss how to find the run.
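The index-shifting step can be illustrated with a minimal Python sketch. It assumes the breadth-first numbering used above (root $v_1$ at depth 1, parent of $v_i$ is $v_{\lfloor i/2 \rfloor}$); the function names are hypothetical and not taken from the paper.

```python
def depth(i: int) -> int:
    """Depth of node v_i in B, with the root v_1 at depth 1: 1 + floor(log2 i)."""
    if i < 1:
        raise ValueError("breadth-first indices start at 1")
    return i.bit_length()


def ancestor_at_depth(i: int, d_prime: int) -> int:
    """Index of the ancestor of v_i at depth d_prime, found by right-shifting i."""
    d = depth(i)
    if not 1 <= d_prime <= d:
        raise ValueError("target depth must lie between 1 and depth(i)")
    # Each right shift by one bit moves from a node to its parent.
    return i >> (d - d_prime)
```

For example, `ancestor_at_depth(13, 2)` shifts the index 13 (binary 1101, depth 4) right by two bits.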

Consider node w in T and let $run_i$ be the run that contains w. It is obvious that any ancestor of w in T is contained in one of the ancestor runs of $run_i$ in R. Furthermore, if $run_j$ is an ancestor of $run_i$ in R, then $v_j$ is also an ancestor of $v_i$ in B. Let $run_{a_1}, run_{a_2}, \ldots, run_{a_k}$ ($a_1 = 1 < a_2 < \cdots < a_k = i$) be the ancestor runs of $run_i$ (including itself). Note that the binary number $r_i$ contains k 1-bits.

For any j such that $0 < j \le k$, we can access $run_{a_j}$ in constant time by using the value of $r_i$ as follows: Let r' be a binary number that has only one 1-bit whose position is the same as the jth-rightmost 1-bit of the binary number $r_i$. Let r'' be a binary number that has only one 1-bit whose position is the same as the leftmost 1-bit of the binary number $r_i$. Then $a_j = \lfloor i \cdot (r'/r'') \rfloor$ (a right shift by $\log_2 r'' - \log_2 r'$ bits). Using this constant-time access to the ancestor runs, we can search for the target run in $O(\log k)$ time by checking the depths of the first nodes of the ancestor runs. Hence we conclude that the time for finding the target node is $O(\log \log n)$, because $k \le 1 + \log_2 m' < 2 + \log_2 m$. Note that it is obvious that all of the data structures used above can be constructed in a total of $O(n)$ time.
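The computation of $a_j$ from i and $r_i$ can be sketched as follows. This is only an illustration under the definitions above: the function name is hypothetical, and the loop that locates the jth-rightmost 1-bit is written for clarity and is not itself a constant-time operation.

```python
def ancestor_run_index(i: int, r_i: int, j: int) -> int:
    """Return a_j, the index of the j-th ancestor run of run_i (a_1 = 1, a_k = i)."""
    k = bin(r_i).count("1")  # number of ancestor runs, including run_i itself
    if not 1 <= j <= k:
        raise ValueError("j must satisfy 1 <= j <= k")
    r = r_i
    for _ in range(j - 1):
        r &= r - 1  # clear the lowest set bit
    r_prime = r & -r  # single 1-bit at the j-th rightmost 1-bit of r_i
    r_double_prime = 1 << (r_i.bit_length() - 1)  # leftmost 1-bit of r_i
    # Right shift by log2(r'') - log2(r') bits.
    shift = r_double_prime.bit_length() - r_prime.bit_length()
    return i >> shift
```

As a small check, take i = 5 with $r_5$ = 111 in binary (ancestor runs $run_1$, $run_2$, $run_5$): the second ancestor run is obtained by shifting 5 right by one bit.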



Proof of Theorem 4

This case is far simpler than that of the preceding theorem. For a completely balanced binary tree, we can find the ancestor of depth d in constant time by indexing nodes in breadth-first order and shifting the bits of the indices. The case of shallow binary trees is also straightforward: consider a minimum completely balanced binary tree that contains the shallow binary tree of size n as a subgraph. Its size is O(n), and it can be built in O(n) time. We can find the target ancestor in the new tree in constant time.

In a general shallow k-ary tree where k is some constant, every node can be mapped to a node of a shallow binary tree of size O(n) in such a way that the depth of the mapped node is a constant multiple of the depth of the original node in the original tree. In this way, we conclude that such an ancestor can be found in constant time after linear-time preprocessing.
Abstract. We design an algorithm that generates multiset permutations in O(1) time from permutation to permutation, using only arrays as data structures. The previous O(1) time algorithm used pointers, causing O(n) time to access an element in a permutation, where n is the size of the permutations. The central idea in our algorithm is tree traversal. We associate permutations with the leaves of a tree. By traversing this tree, going up and down and making changes when necessary, we spend O(1) time from permutation to permutation. Permutations are generated in a one-dimensional array.



1 Introduction 

Algorithms for generating combinatorial objects, such as (multiset) permutations, (multiset) combinations, and well-formed parenthesis strings, are a well-studied area, and many results are documented in Nijenhuis and Wilf, and Reingold, Nievergelt, and Deo, etc.

Let n be the size of the objects to be generated. The most primitive algorithms are recursive ones that generate those objects in lexicographic order, causing O(n) changes from object to object, and thus O(n) time. To overcome this drawback, many algorithms were invented that generate objects with a constant number of changes, O(1) changes, from object to object. This idea of generating combinatorial objects with O(1) changes is named “combinatorial
Gray codes” , and a good survey is given in nn. In many cases, these changes are 
made by swapping two elements, that is, two changes. It is still easy to design recursive algorithms for combinatorial generation with O(1) changes, since we can control the paths of the tree of recursive calls and thus we can rather easily identify the places to change. Note that combinatorial objects correspond to the leaves of the tree, meaning that it takes O(n) time from object to object as the height of the tree is n. To overcome this shortcoming as well, several attempts were made to design iterative algorithms, which are called loopless algorithms in some of the literature, removing recursion, so that O(1) time is achieved from object to object. At this stage, we need some care in defining O(1) time from object to object. In Korsh and Lipschutz, O(1) time was achieved to generate multiset permutations, with an algorithm that is a refinement of that by Hu and Tien [1]. In this algorithm, multiset permutations are given one after another in a linked list. The operations on the list are manipulated by pointers, involving shift operations in O(1) time. For example, the list (1, 1, 1, 2, 2, 2) with n = 6 can be converted to (2, 2, 2, 1, 1, 1) in O(1) time by changing pointers. We assume that the above conversion takes O(n) time in this paper, and we claim that multiset permutations can be generated in O(1) time using arrays, not pointers.

This kind of strict requirement for O(1) time was demonstrated in the recent development in parenthesis string generation. An O(1) change algorithm was developed in Ruskey and Proskurowski [10] and an O(1) time algorithm with pointer structures was achieved in Roelants van Baronaigien, and they challenged the readers, asking whether there could be O(1) algorithms with arrays, whereby stricter O(1) time could be achieved. This problem was recently solved by three independent works of Mikawa and Takaoka [5], Vajnovszki [13], and Walsh [14]. Note that we can access any element of a combinatorial object in O(1) time in an array implementation, whereas we need O(n) time in a linked list implementation, as we must traverse the pointer structure. The algorithm by Ko and Ruskey generates multiset permutations by swapping two elements, but not with O(1) time from permutation to permutation.

The main idea of O(1) time for multiset permutation generation in this paper is tree traversal. The generation tree for a set of permutations, arranged in some order, on the given multiset is a tree whose paths to the leaves correspond to the permutations. Basically we traverse the tree in movements of (up, cross, down). The move “up” is to go up the tree from a node to one of its ancestors. The move “cross” is to move from a node to its adjacent sibling, causing a swap of the element at that level with one at a level closer to the leaf. The move “down” is to go down from a node to one of its descendants, which we call the landing point. The landing point has no sibling and the path to the leaf has no branching, forming a straight line. It is important that we avoid traversing this straight line node by node. The core part of the algorithm is centered on how to compute the positions to which we go up and down, and where we should perform swaps. Although the use of tree structure for combinatorial generation originated in Lucas and Zerling and is well known, the technique of tree traversal in this paper is new.

Since the final algorithm is rather complicated, we go through a stepwise refinement process, going from simple structures to details. In Section 2, we define the generation tree and design a recursive algorithm that traverses this tree to generate multiset permutations. We give a formal proof of the recursive algorithm. In Section 3, we design an iterative algorithm based on the recursive algorithm. We first describe an informal framework for an iterative algorithm, and translate the recursive algorithm into an iterative one guided by the framework. The resulting iterative algorithm generates multiset permutations in O(1) time in a one-dimensional array. As additional data structures, we use a few more arrays, causing an O(kn) space requirement, where k is the number of distinct elements in the multiset. In Section 4, we give concluding remarks.
Fig. 1. Generation tree for permutations on [1, 2, 2, 3, 3]

2 Permutation Tree and Recursive Algorithm

We denote a multiset by [...] and an ordinary set by {...}. Those notations identify operations such as set union and set subtraction when the same symbols are used on sets and multisets. We convert a multiset S to the set set(S) by removing repetition of each element. If S = [1, 1, 2], for example, set(S) = {1, 2}. Let a multiset S = [1, ..., 1, 2, ..., 2, ..., k, ..., k] be defined by $(m_1, m_2, \ldots, m_k)$, where $m_i$ is the multiplicity of i. Let P be a set of all multiset permutations on S arranged in some order. Since S is the base multiset for P, we use the notation base(P) = S. We use the word “permutation” for “multiset permutation” for simplicity. Let $N = n!/(m_1! \cdots m_k!)$. Then we have |P| = N. Let $x \in P$ be a permutation given by $x = a_1 a_2 \ldots a_n$. We construct the permutation tree of P, T(P), in such a way that each $x \in P$ is associated with a path from the root to a leaf. Since the path from the root to a leaf is unique in a tree, x will also correspond to the leaf at the end of the path. If x' is the next permutation after x in P, we let x' correspond to the leaf next to that for x. Let x' be given by $x' = a_1 \ldots a_i a'_{i+1} \ldots a'_n$. That is, x' shares some prefix (possibly empty) with x. Then the paths to the two adjacent leaves x and x' share the path corresponding to $a_1 \ldots a_i$.

Example 1. Let S be given by $(m_1, m_2, m_3) = (1, 2, 2)$. We give P and T(P) in Fig. 1.

In this example, we assume we give the permutations in P in this order. The number shown by (i) to the right side of each permutation indicates the ith permutation. This list of permutations also gives the shape of the tree T(P). The root at level 0 has three branches leading to sibling nodes at level 1 with labels 1, 2, and 3. Then the node at level 1 with label 1 has two branches leading to sibling nodes with labels 2 and 3, etc. We have 5!/(1!2!2!) = 30 members in P. We draw the tree horizontally, rather than vertically, for notational convenience.
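The count N = n!/(m_1! ... m_k!) used in Example 1 can be checked with a few lines of Python. This is an illustrative sketch only; the function name is hypothetical.

```python
from math import factorial


def count_multiset_permutations(multiplicities):
    """Number of permutations of a multiset with the given multiplicities."""
    n = sum(multiplicities)
    count = factorial(n)
    for m in multiplicities:
        # Each division is exact: the running value is n!/(m_1! ... m_j!),
        # a product of binomial-type factors.
        count //= factorial(m)
    return count
```

Calling `count_multiset_permutations((1, 2, 2))` reproduces the size of P in Example 1.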

We use a list nodes[i] of elements from the set set(S) of a multiset S to keep track of siblings at level i. We define two types of operation with notation ⇐. Operation t ⇐ nodes[i] means that the first element of nodes[i] is moved to a single variable t. Operation nodes[i] ⇐ t means that t is appended to the end of nodes[i]. The history of variable t keeps track of all elements in nodes[i]. For a list L, set(L) is the set made of elements taken from L, and next(L) is the second element of L. We identify nodes of the tree by array elements of a whenever clear from context. A recursive algorithm is given below.

Algorithm 1 Recursive algorithm

1. procedure generate(i);
2. var t, s;
3. begin
4.   nodes[i] := (a[i]);
5.   if i ≤ n then
6.   repeat
7.     generate(i + 1);
8.     Let s be the leftmost position of next(nodes[i]) in a such that i < s;
9.     if a[i] is not a last child then swap(a[i], a[s]);
10.    if a[i − 1] is a first child then
11.      if a[i] ≠ a[i − 1] then nodes[i − 1] ⇐ a[i];
12.    t ⇐ nodes[i]
13.  until nodes[i] = ∅
14. end;
15. begin { main program }
16.   Let a = [1, ..., 1, 2, ..., 2, ..., k, ..., k];
17.   generate(1)
18. end.
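The list operations on nodes[i] used in Algorithm 1 can be mimicked with a double-ended queue. The following Python sketch is an assumption about one convenient representation, not the data structure of the author's Pascal program; the helper names are hypothetical.

```python
from collections import deque


def take_first(nodes_i: deque):
    """t <= nodes[i]: remove the first element of the list and return it."""
    return nodes_i.popleft()


def append_last(nodes_i: deque, t) -> None:
    """nodes[i] <= t: append t to the end of the list."""
    nodes_i.append(t)


def next_of(nodes_i: deque):
    """next(L): the second element of the list, or None if there is none."""
    return nodes_i[1] if len(nodes_i) > 1 else None
```

With `nodes_1 = deque([1])`, appending 2 and 3 gives the list (1, 2, 3) of Example 2, and `next_of(nodes_1)` is the element that the next cross move brings to level 1.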



Let tail(a) be the consecutive portion of the tail part of a such that all elements in tail(a) are equal to a[n]. Let Q be a set of all permutations generated from $S - [a_1, \ldots, a_i]$. Then the notation $a_1 \ldots a_i Q$ means the set of permutations made by concatenating $a_1 \ldots a_i$ with all members of Q. We use the notations $a_i$ and a[i] interchangeably to denote the i-th element of array a. We state the following obvious lemmas.

Lemma 1. Let P be the set of permutations on the multiset S of size n and $first(P) = \{x_1 \mid x_1 x_2 \ldots x_n \in P\}$. Then first(P) = set(S).

Lemma 2. Let S be a multiset and P be the set of permutations on S. Then $P = b_1 Q_1 \cup \ldots \cup b_l Q_l$, where $set(S) = \{b_1, \ldots, b_l\}$ and $Q_j$ is the set of permutations on the multiset $S - [b_j]$.
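Lemma 2 translates directly into a simple recursive generator. The Python sketch below is not Algorithm 1 (which works in place by swapping); it only illustrates the decomposition of P into the sets $b_j Q_j$, visiting the distinct elements in increasing order, which is an arbitrary choice made for this example.

```python
from collections import Counter


def multiset_permutations(multiset):
    """Yield every permutation of the multiset exactly once, following Lemma 2."""
    counts = Counter(multiset)
    n = len(multiset)
    prefix = []

    def rec():
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for b in sorted(counts):      # the distinct elements b_1, ..., b_l
            if counts[b] > 0:         # b is still available in S minus the prefix
                counts[b] -= 1        # pass to the multiset S - [b]
                prefix.append(b)
                yield from rec()      # all members of Q_j, each prefixed by b_j
                prefix.pop()
                counts[b] += 1

    yield from rec()
```

This generator spends O(n) time per permutation because of the recursion depth, which is exactly the overhead that Section 3 sets out to remove.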

Theorem 1. Algorithm 1 generates all permutations on S by swapping.

Proof. We show by backward induction that generate(i) generates the set P of all permutations on [a[i], ..., a[n]]. The case of i = n is obvious. Suppose the theorem is true for i + 1. Then observe that the first call of generate(i + 1) in generate(i) will generate all permutations on [a[i + 1], ..., a[n]] by induction, which we denote by Q. From Lemma 1, it holds that first(Q) = set([a[i + 1], ..., a[n]]). From lines 10-11 of the program we have set(nodes[i]) = {a[i]} ∪ first(Q) = set([a[i], ..., a[n]]) immediately after the first call of generate(i + 1). Let set([a[i], ..., a[n]]) = $\{b_1, \ldots, b_l\}$ for some l such that $b_1 = a[i]$ at the beginning of generate(i). Then we are generating $a[1] \ldots a[i-1] \, b_j Q_j$ for j = 1, ..., l, where $base(Q_j) = base(Q_{j-1}) - [b_j] \cup [b_{j-1}]$ for j > 1, and $Q_1 = Q$. Since we swap a[i] and a[s] at the end of each call of generate(i + 1), l different multisets are given in (a[i + 1], ..., a[n]) as $base(Q_j)$ before calls of generate(i + 1). From Lemma 2, we conclude that the set P is generated by calling generate(i + 1) with all $b_j$ given in t.

Example 2. Let i = 1, and suppose we start from a = [1, 2, 2, 3, 3]. Then we have base(Q) = [2, 2, 3, 3], and first(Q) = {2, 3}. Since these elements are appended to nodes[i] = (1), we have nodes[1] = (1, 2, 3), which forms first(P), where P is the entire set of permutations.

Note that the choice of position s at line 9 can be arbitrary as long as we choose a position s such that next(nodes[i]) = a[s] and i < s. Two consecutive permutations before and after a swap differ only at positions i and s such that i < s. In this context, we say i is the difference point and s is the solution point. In Example 1, the permutations on [1, 2, 2, 3, 3] are generated by this algorithm.



3 O(1) Implementation

Algorithm 1 takes O(n) time from permutation to permutation due to its recursive structure. In this section we avoid this O(n) overhead time for traversing the tree. By using some data structures, we jump from node to node in the permutation tree.

When we first call generate(1), it will go down to level n and come back to level i = n − |tail(a)| + 1 without doing any substantial work, since all nodes on this path are last children. At this level the algorithm appends a[i] = k to nodes[i − 1] and comes to level i = n − |tail(a)|, that is, i is decreased by 1. Then it swaps a[i] and a[i + 1], adds the new a[i] to nodes[i − 1], and goes down to level n. When the algorithm traverses the tree downwards and upwards, there are many steps that can be avoided. Specifically we can start from level i = n − |tail(a)| + 1. After we perform swapping, we can come down straight to level i = n − |tail(a)| + 1 with the new tail(a). We keep two arrays up and down to navigate our traversal in the tree; up[i] tells where to go up from level i and down[i] tells where to go down from level i. Level up[i] is the level where we hit a non-last child when we traverse the tree from level i. The formal definition of down[i] is given later.
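The starting level i = n − |tail(a)| + 1 can be computed as in the following Python sketch. It scans the array, so it takes O(n) time and is meant only to illustrate the definition of tail(a); the iterative algorithm avoids such scans. The function names are hypothetical, and the Python list is 0-indexed while the levels in the text are 1-indexed.

```python
def tail_length(a):
    """Length of tail(a): the maximal run at the end of a whose elements equal the last one."""
    if not a:
        raise ValueError("a must be non-empty")
    n = len(a)
    length = 1
    while length < n and a[n - 1 - length] == a[n - 1]:
        length += 1
    return length


def start_level(a):
    """1-indexed level n - |tail(a)| + 1, i.e. the left end of tail(a)."""
    return len(a) - tail_length(a) + 1
```

For a = [1, 2, 2, 3, 3] the tail is (3, 3), so the traversal would start at level 4.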

When we perform swap(a[i], a[s]), we need the information of s at hand without computing the leftmost position of next(nodes[i]) to the right of i. Obtaining this information for level up[i] is carried out when we come to a last child at level i, by setting s[up[i]] to i if a[i] = next(nodes[up[i]]) for the first time. For this purpose, the variable s is replaced by an array s that keeps the information of s for each level.

Example 3. In Figure 1, we can start from the point Start. Suppose we reached the point A after several steps. We have up[A] = 2, which we inherited from up[A]. Since next(nodes[2]) = 1, we set s[up[4]] = 4. We make the transition A → B → C → D. When we cross from B to C, we swap a[2] and a[4] and come to the landing point D.

We translate Algorithm 1 into the following informal iterative algorithm for traversing the tree, resulting subsequently in Algorithm 3.

Algorithm 2 Informal iterative tree traversal

initialize a to be the first permutation on S;
initialize up[i] and down[i] to i for i = 0, ..., n;
initialize nodes[i] to (a[i]) for i = 1, ..., n;
i := n − |tail(a)| + 1;
repeat
  if nodes[i − 1] has not been updated by a[i − 1]'s children then update it;
  output(a);
  if a[i] is not a last child then swap(a[i], a[s[i]]); {action cross}
  update nodes[i − 1];
  if a[i] is a last child then begin
    up[i] := up[i − 1]; up[i − 1] := i − 1;
    update s[up[i]];
    update down[up[i]];
    if i = n − |tail(a)| + 1 {a[i], ..., a[n] form a straight line}
    then i := up[i]; {going up}
    else i := down[i]; {going down}
  end
  else {a[i] is not a last child}
    i := down[i] {going down}
until i = 0 {root level}.



As we cross from a node to the next, swapping two array elements, tail(a) grows or shrinks. For the computation of tail(a), which, in turn, gives the information of down, we use array run. Array run keeps track of the length of the run of consecutive array elements that are equal to a[i] when we traverse the path of last children. Array run is computed by increasing run[up[i]] by 1 when we hit a[up[i]] = a[i] on the path, and resetting it to 0 otherwise. These values of run are used to compute tail(a) after we perform the swap operation, whereby we can compute the values of down. Specifically we can set down[up[i]] := i − run[up[i]] if a[up[i]] = a[n] and down[i] = i + 1. Note that down[i] = i + 1 means a[i + 1] = ... = a[n], since the landing point is the left end of tail(a). In other cases, down[up[i]] is set to i + 1 or i depending on the situation at level i, as described in the comments of Algorithm 3.

The Boolean value mark[i] = true shows that the value of down[i] has been set and prevents further modification.

If we hit a non-last child we always go down guided by down[i]. If we hit a last child, we go down if i < down[i] and go up if i = down[i].

When we go up to the ancestor, the path to the node on which we stand consists of last children. We call this path the current path. When we go down from a node to a descendant, the path from the node to the descendant consists of first children. We call this path the opposite path. Most of the work in the algorithm is to prepare the necessary environment for the opposite path when we are traversing the current path. We jump over the opposite path from the left end to the landing point, whereas we traverse the current path node by node. Whenever we come to a node, the necessary information for the next action must be ready.

Example 4. In Fig. 2, run[up[i]] = 2 for the two b's between d and c on the current path. We swap b at level up[i] and c at level i and go down to level down[up[i]], that is, the leftmost position of tail(a) on the opposite path, which consists of five b's.

Fig. 2. Illustration of run (levels up[i], down[up[i]], and i are marked).

We leave the details of implementation, including the data structure nodes[i], to a full Pascal program at “http://www.cosc.canterbury.ac.nz/~tad/perm.p”.

Algorithm 3 Iterative algorithm for multiset permutations {with comments}.

1. a := [1, ..., 1, 2, ..., 2, ..., k, ..., k];
2. for i := 1 to n do begin nodes[i] := (a[i]); s[i] := 0;
     mark[i] := false; up[i] := i end;
3. i := n − m[k] + 1;
4. up[0] := 0;
5. repeat
6.   if a[i − 1] is a first child and nodes[i − 1] has not been updated by its children
7.   then if a[i] ≠ a[i − 1] then nodes[i − 1] ⇐ a[i];
8.   if |nodes[i]| > 1 then begin {current node a[i] is not a last child}
9.     swap(a[i], a[s[i]]); {crossing}
10.    mark[i] := false; {this shows down[i] for level i needs to be updated for later use}
11.    nodes[s[i]] := (a[s[i]]); {prepare nodes for the solution point}
12.    remove first of nodes[i];
13.    if a[i − 1] is a first child then
14.      if a[i] ≠ a[i − 1] then nodes[i − 1] ⇐ a[i]; {update nodes[i − 1]}
15.    s[i] := 0 {solution point for level i is to be set}
16.  end;
17.  run[i] := 0; {initialize run[i]}
18.  if a[i] is a last child then begin
19.    up[i] := up[i − 1]; {up propagates} up[i − 1] := i − 1; {up[i − 1] is reset}
20.    if i < n then
21.      if a[up[i]] = a[i] and up[i] < i then
22.        run[up[i]] := run[up[i]] + 1 {extend run for level up[i]}
23.      else run[up[i]] := 0; {reset run}
24.    if a[i] = next(nodes[up[i]]) then begin {do the following for up[i]}
25.      if s[up[i]] = 0 then begin
26.        if i = down[i] − 1 and a[up[i]] = a[n] then begin
27.          down[up[i]] := i − run[up[i]]; {compute down for up[i]}
28.          mark[up[i]] := true; {down for up[i] has been finalized}
29.        end else down[up[i]] := i + 1; {down[up[i]] at least i + 1}
30.        s[up[i]] := i {solution point for up[i] is i}
31.      end
32.      else begin {s[up[i]] ≠ 0}
33.        nodes[i] := (a[i]); {prepare nodes for the opposite path}
34.        if mark[up[i]] = false then down[up[i]] := i {update down}
35.      end
36.    else begin {a[i] ≠ next(nodes[up[i]])}
37.      nodes[i] := (a[i]); {similar to 33}
38.      if mark[up[i]] = false then down[up[i]] := i {similar to 34}
39.    end;
40.    if i < down[i] then begin {going down}
41.      mark[i] := false; {set mark to false for the opposite path}
42.      i := down[i];
43.      nodes[i] := (a[i]); {initialize nodes}
44.    end else
45.    begin {going up}
46.      i1 := i;
47.      i := up[i];
48.      up[i1] := i1; {reset up for the old i}
49.    end
50.  end else
51.  begin {a[i] is not a last child, going down}
52.    i := down[i];
53.    nodes[i] := (a[i]); {similar to 43}
54.  end;
55. until i = 0.



4 Concluding Remarks 

We developed an O(1) time algorithm for generating multiset permutations. The main idea is tree traversal and identification of swapping positions. This technique is general enough to solve other combinatorial generation problems. In fact, this technique stemmed from that used in generation of parenthesis strings
in 1^. The author succeeded in designing 0(1) time generation algorithms for 
other combinatorial objects, such as in-place combinations, reported in m- 

The key point is the computation of up, down, and s, the solution point, in which up is very much standard in almost all kinds of combinatorial objects. If we always go down to leaves, we need not worry about down. This happens with more regular structures, such as binary reflected Gray codes, ordinary permutations, and parenthesis strings, where we can concentrate on the computation of s. Multiset combinations and permutations have more irregular structures, that is, straight lines at some places, which require the computation of down, in addition to that of s. There are still many kinds of combinatorial objects for which only O(1) change algorithms are known. The present technique will bring about O(1) time algorithms for those objects.

The space requirement for the algorithm is O(kn). It is open whether this can be optimized to O(n).
Abstract. Given a boolean CNF formula F of length |F| (sum of the number of variables in each clause) with m clauses on n variables, we prove the following results.

- The MAXSAT problem, which asks for an assignment satisfying the maximum number of clauses of F, can be solved in $O(1.341294^m |F|)$ time.

- The parameterized version of the problem, that is, determining whether there exists an assignment satisfying at least k clauses of the formula (for some integer k), can be solved in $O(k^2 1.380278^k + |F|)$ time.

- MAXSAT can be solved in $O(1.105729^{|F|} |F|)$ time.

These bounds improve the recent bounds of respectively $O(1.3972^m |F|)$, $O(k^2 1.3995^k + |F|)$ and $O(1.1279^{|F|} |F|)$ due to Niedermeier and Rossmanith for these problems. Our last bound comes quite close to the $O(1.07578^{|F|} |F|)$ bound of Hirsch for the Satisfiability problem (not MAXSAT).

1 Introduction 

There are several approaches to deal with an NP-complete problem. A popular approach is to settle for a polynomial time approximate solution with a provable guarantee on the quality of the solution. Another well-studied approach is to exploit some structure in the problem to design efficient algorithms (for example, designing efficient algorithms for NP-complete graph problems in special classes of graphs). Another recent approach for the parameterized versions of NP-complete problems is to look for fixed parameter tractable algorithms [5]. Despite all these approaches, many NP-hard problems often have to be solved exactly in practice. For this reason, considerable attention has been paid to designing efficient (better than the naive $2^n$) algorithms for several NP-hard problems. Satisfiability, Vertex Cover, and Independent Set are some of the problems for which such efficient exact algorithms are known. In this paper we develop efficient exact algorithms for another fundamental NP-hard problem, MAXSAT (Maximum Satisfiability). Despite considerable advances in the study of exact algorithms for the Satisfiability problem, designing efficient exact algorithms for MAXSAT has started receiving the attention of researchers only recently.

* The work was done while the first author was at IIT Mumbai and visited IMSc Chennai as a summer student.

For the rest of the paper we will be dealing with boolean CNF formulae on 
n variables and m clauses. For a formula F, |F| will denote the length of the 
formula which is the sum of the number of variables in each clause. 
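The three size measures n, m, and |F| can be made concrete with a short Python sketch. The clause representation (a list of clauses, each a list of nonzero integers with a negative sign for a negated variable, as in the common DIMACS convention) is an assumption made for this illustration and does not come from the paper.

```python
def formula_stats(clauses):
    """Return (n, m, length) for a CNF formula given as a list of integer clauses."""
    m = len(clauses)                                           # number of clauses
    n = len({abs(lit) for clause in clauses for lit in clause})  # number of variables
    length = sum(len(clause) for clause in clauses)            # |F|
    return n, m, length
```

For the formula (x1 or not x2) and (x2 or x3 or not x1) and (not x3), encoded as `[[1, -2], [2, 3, -1], [-3]]`, there are three variables, three clauses, and length 6.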

Raman, Ravikumar and Srinivasa Rao [14] showed that the MAXSAT problem remains NP-hard even if every variable appears at most three times. They gave an $O(\sqrt{3}^{\,n} |F|)$ algorithm for this version of the MAXSAT problem. For the general MAXSAT, Mahajan and Raman gave an $O(|F| \phi^m)$ algorithm, where $\phi = 1.618\ldots$ is the golden ratio. This was improved to $O(|F| 1.3995^m)$ by Niedermeier and Rossmanith. In this paper, we improve this to $O(|F| 1.341294^m)$. For the natural parameterized question, whether at least k clauses of the formula are satisfiable, Niedermeier and Rossmanith improved the algorithm of Mahajan and Raman to obtain a bound of $O(k^2 1.3995^k + |F|)$. In this paper we improve this to $O(k^2 1.380278^k + |F|)$ time.



In the development of efficient exact algorithms for SAT, bounds where the 
exponent is the length of the formula have been obtained (see, for example, 
EEI). We also design algorithms for MAXSAT along similar lines to obtain 
a bound of $O(1.105729^{|F|} |F|)$ for MAXSAT, improving upon the recent bound of $O(1.1279^{|F|} |F|)$ due to Niedermeier and Rossmanith [11]. This bound implies that if every variable occurs at most 6 times, then we have an $O((2 - \epsilon)^n |F|)$ algorithm for some positive $\epsilon < 1$. Our bound also comes quite close to the $O(1.07578^{|F|} |F|)$ bound of Hirsch for the Satisfiability problem (not MAXSAT).
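The remark about variables occurring at most 6 times follows from |F| ≤ 6n, which turns the base 1.105729 per unit of length into a base of 1.105729^6 per variable. The short Python check below is only a sanity check of that arithmetic; the variable names are hypothetical.

```python
BASE_PER_LENGTH = 1.105729  # base of the bound whose exponent is |F|

# If every variable occurs at most d times, then |F| <= d * n, so the
# bound becomes (BASE_PER_LENGTH ** d) ** n in terms of the number of variables n.
base_for_six_occurrences = BASE_PER_LENGTH ** 6
base_for_seven_occurrences = BASE_PER_LENGTH ** 7
```

The first value is below 2 while the second is above 2, which is consistent with the threshold of 6 occurrences stated in the text.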

En route to our main algorithms proving the above bounds, we develop an $O(1.324719^n |F|)$ algorithm for the case when every variable in the formula appears at most three times. (Due to page limitations, this algorithm appears only in the technical report version of the paper.) This bound significantly improves the bound of $O(\sqrt{3}^{\,n} |F|)$ due to Raman, Ravikumar and Srinivasa Rao [14] mentioned above. This improved algorithm for the special case is also used later in our main algorithms.

We remark that all our algorithms employ some simplification rules as well as the standard Davis-Putnam type branching rules, and involve an extensive case analysis, as is typically the case with most of the exact algorithms referred to earlier. Further, we emphasize that throughout the development of algorithms for these problems, even small improvements in the second or even the third digit after the decimal point in the base of the exponential can mean significant progress.

In the next section, we give some notations and definitions used in our al- 
gorithms. Section 3 gives some simplification or reduction rules. Section 4 gives 
some branching rules which will be later employed in the main algorithms. In 
Section 5, two new algorithms are proposed for the general MAXSAT problem. 
The exponent in the time bound of the first one is the number of clauses and 
that of the second one is the length of the formula. 
2 Notation

Let l be a literal in a formula F. We will call it an $(i, j)[p_1, p_2, \ldots, p_i][n_1, n_2, \ldots, n_j]$ literal if it occurs i times positively and j times negatively, the clauses containing $l$ are of length $p_1 \le p_2 \le \cdots \le p_i$, and those containing $\bar{l}$ are of length $n_1 \le n_2 \le \cdots \le n_j$. If the length of the clauses is not important we will simply call it an (i, j) literal. If we say that some variable or literal occurs $i^+$ times, we will mean that it occurs at least i times. Similarly $i^-$ denotes at most i occurrences. For a variable x, we say x occurs in a clause C if $x \in C$ or $\bar{x} \in C$. We call x a k-variable if x is some (i, j) literal such that i + j = k.

If a literal x occurs as a unit clause {x} k times in the formula, we will denote the fact by $n_u(x) = k$. For a literal x, l(x) will denote the sum of the lengths of the clauses containing x.

If F is a formula, $F[p_1, \ldots, p_k][n_1, \ldots, n_l]$ will denote the formula obtained by putting the literals $p_1 = \cdots = p_k = true$ and $n_1 = \cdots = n_l = false$. If C is a clause then F[][C] will denote the formula obtained by putting all the literals in C as false.

Our algorithms for MAXSAT first go through some simplification and reduction rules where the given formula is simplified. On the reduced formula, we then apply some branching rules. If a branching rule for the formula F branches as $F[X_1][Y_1], F[X_2][Y_2], \ldots, F[X_t][Y_t]$, it means that each of the formulae $F[X_j][Y_j]$, j = 1 to t, is solved (recursively or otherwise) and the assignment that satisfies the maximum number (or the required number, in the parameterized problem) of clauses among them is returned. The branching rule ensures that the returned assignment actually satisfies the maximum (or the required) number of clauses of F. If the question to be answered is a decision question (like “can k clauses be satisfied?”), then the answer returned is ‘yes’ if any of the branches returns ‘yes’ and ‘no’ otherwise. If the “size” of F is q and the size of $F[X_j][Y_j]$ is at most $q - r_j$ for all j, where “size” is the number of variables, the number of clauses or the number of clauses to be satisfied, as the case may be, then such a branching rule is said to have a branching vector $(r_1, r_2, \ldots, r_t)$.
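The passage defines branching vectors but does not spell out how a vector is turned into the base of an exponential bound. A standard tool in the analysis of branching algorithms, assumed here and not quoted from the paper, is the branching number: the unique root x ≥ 1 of $x^{-r_1} + \cdots + x^{-r_t} = 1$. The Python sketch below computes it by bisection; the function name and the tolerance parameter are hypothetical.

```python
def branching_number(vector, tol=1e-12):
    """Root x >= 1 of sum(x ** -r for r in vector) = 1, found by bisection."""
    if not vector or any(r <= 0 for r in vector):
        raise ValueError("branching vector entries must be positive")
    if len(vector) == 1:
        return 1.0  # a single branch causes no exponential growth

    def excess(x):
        # Strictly decreasing in x, positive at x = 1 when there are >= 2 branches.
        return sum(x ** (-r) for r in vector) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:  # enlarge the bracket until the root lies in [lo, hi]
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For instance, the vector (1, 2) yields a value numerically equal to the golden ratio, and the vector (2, 3) yields approximately 1.324718.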

If two clauses $C_1$ and $C_2$ contain some pair of complementary variables, we will denote the fact by $Co(C_1, C_2)$. A subset of clauses is said to be closed if no variable in this subset occurs outside this subset in the rest of the formula. For a clause C, $C - \{a_1, a_2, \ldots, a_k\}$ will denote the clause obtained by deleting all instances of the literals $a_1, a_2, \ldots, a_k$ from the clause C. We will denote the number of clauses in a formula F by m(F). The maximum number of clauses that are satisfied by an optimal assignment will be denoted by opt(F).

3 Simplification Rules 

We list a set of rules to reduce a given CNF formula so that solving the MAXSAT 
problem for the original formula is equivalent to solving it for the reduced one. 
First we assume that the formula has no empty clause (which is vacuously false), 
and that every variable appears both positively and negatively - i.e. there are 
no $(i, 0)$ or $(0, i)$ variables as they can be eliminated easily by assigning them an
appropriate value. 

1. Elimination of $(1, 1)$ literals: If $a$ is a $(1, 1)$ literal, with $a \in C_1$ and
$\bar{a} \in C_2$, then if $Co(C_1 - \{a\}, C_2 - \{\bar{a}\})$ just remove the clauses.
Otherwise, replace them by a single clause $\{C_1 \cup C_2 - \{a, \bar{a}\}\}$.

2. Replacement of almost common clauses: If there exist clauses $C_1$ and $C_2$ and a
literal $a$ such that $C_1 - \{a\} = C_2 - \{\bar{a}\}$, then replace $C_1$ and $C_2$ by the clause
$\{C_1 \cup C_2 - \{a, \bar{a}\}\}$.

In particular this implies that we cannot have $\{a\}$ and $\{\bar{a}\}$ together as unit
clauses in the same formula.

In both these rules, if $F'$ is obtained from $F$ by application of the rule, clearly
$opt(F) - opt(F') = m(F) - m(F')$ and this justifies the rules.

Note that each simplification rule can be applied to a formula $F$ in $O(|F|)$
time. Further, after all simplification rules are applied, every (remaining) variable
appears both positively and negatively and each variable appears at least three
times in the formula. Also, if a variable appears in unit clauses, all its occurrences
in the unit clauses are either all pure or all negated.
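
As a small illustration of simplification rule 1, the sketch below applies it to a formula held as a list of `frozenset`s of signed integers (a positive integer for a variable, a negative one for its negation) and checks the identity $opt(F) - opt(F') = m(F) - m(F')$ by brute force. The representation and the names `opt`, `complementary` and `eliminate_11_literal` are assumptions made for this example only.

```python
from itertools import product


def opt(clauses):
    """Brute-force maximum number of satisfiable clauses (tiny inputs only)."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    best = 0
    for bits in product((False, True), repeat=len(variables)):
        value = dict(zip(variables, bits))
        sat = sum(any(value[abs(lit)] == (lit > 0) for lit in c) for c in clauses)
        best = max(best, sat)
    return best


def complementary(c1, c2):
    """Co(c1, c2): some literal of c1 occurs negated in c2."""
    return any(-lit in c2 for lit in c1)


def eliminate_11_literal(clauses, a):
    """Apply rule 1 to the (1, 1) literal a and return the new clause list."""
    pos = [c for c in clauses if a in c]
    neg = [c for c in clauses if -a in c]
    assert len(pos) == 1 and len(neg) == 1, "a must be a (1, 1) literal"
    c1, c2 = pos[0] - {a}, neg[0] - {-a}
    rest = [c for c in clauses if a not in c and -a not in c]
    if complementary(c1, c2):
        return rest               # both clauses can always be satisfied
    return rest + [c1 | c2]       # otherwise keep only the merged clause


if __name__ == "__main__":
    F = [frozenset(s) for s in ({1, 2}, {-1, 3}, {-2}, {-3}, {2, 3})]
    G = eliminate_11_literal(F, 1)
    print(opt(F) - opt(G), len(F) - len(G))
```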

4 Branching Rules 

In this section, we give some rules to abandon certain branches of the
Davis-Putnam type branching algorithm. The reason for abandoning them (which we
may not specifically mention in some rules due to page limitation) is that in one
of the branches followed, the number of clauses satisfied is at least as many as
the maximum number of clauses satisfied in any of the branches abandoned.

1. If $x$ is a $(k, l)$ literal, and $n_u(\bar{x}) \ge k$, then branch as $F[][x]$.

2. If $x$ is a $(k, l)$ literal, and $n_u(\bar{x}) = k - 1$, let $C_1, \ldots, C_k$ denote the clauses
containing $x$. If there exist $i, j$ such that $Co(C_i, C_j)$, then branch as $F[][x]$ (as at
least one of $C_i$ and $C_j$ is always satisfied). Otherwise, branch as $F[][x]$ and
$F[x][C_1 \cup \ldots \cup C_k - \{x\}]$.

3. Branching rules for $(1, k)$ literals.

a) If $x$ is a $(1, k)[\ldots][1, \ldots]$ literal, then branch as $F[][x]$. This is a special
case of rule 1.

b) If $x$ is a $(1, k)[2^+][\ldots]$ literal, then if $C$ is the clause containing $x$, branch
as $F[][x]$ and $F[x][C - \{x\}]$. This is a special case of rule 2b.

c) If $x$ is a $(1, k)[\ldots][2, \ldots]$ literal, then if $C$ is the clause of size 2 containing
$\bar{x}$, branch as $F[][x]$ and $F[x, C - \{\bar{x}\}][]$. This gives a $(k, 2)$ branch.

4. Let $x$ be a $(k, l)$ literal and let $C_1, \ldots, C_k$ be the clauses containing $x$. If
there is a variable $y$ such that there are only one or two occurrences of $y$ in
clauses other than $C_1, \ldots, C_k$, then branch as $F[x][]$ and $F[][x]$. This gives a
$(k + 1, l)$ branching vector. For, setting $x$ = true and applying the reduction
rule will eliminate $y$, thus satisfying at least $k + 1$ clauses.

5. If $x$ is a $(2, 2)$ literal, and $C_1$ and $C_2$ are the clauses containing $\bar{x}$, branch as
$F[x][]$, $F[][x, C_1 - \{\bar{x}\}]$ and $F[][x, C_2 - \{\bar{x}\}]$.
Since, if there is an optimum assignment with $x$ = false and $C_1 - \{\bar{x}\}$ = true
and $C_2 - \{\bar{x}\}$ = true, then we might as well set $x$ = true.

6. Let $x$ be a $(1, 3)[1][\ldots]$ literal and let $C_1$, $C_2$ and $C_3$ be the clauses containing $\bar{x}$.
Then, if there exist some two clauses, say $C_2$ and $C_3$, which do not contain any
pair of complementary variables, it is sufficient to branch as $F[x][]$,
$F[][x, C_1 - \{\bar{x}\}]$ and $F[][x, C_2 - \{\bar{x}\}, C_3 - \{\bar{x}\}]$. Otherwise, branch as $F[x][]$.

Since if $Co(C_1, C_2)$, $Co(C_1, C_3)$ and $Co(C_2, C_3)$, then it can be seen that for
any assignment at least two of the clauses in $C_1$, $C_2$ and $C_3$ are satisfied.
Otherwise, if there is an optimal solution with $x$ = false such that at least
two of $C_1 - \{\bar{x}\}$, $C_2 - \{\bar{x}\}$ and $C_3 - \{\bar{x}\}$ are true, then setting $x$ = true does
not decrease the number of clauses satisfied.

7. If $x$ is a $(1, 2)$ literal, and $C_1$ and $C_2$ are the clauses containing $\bar{x}$, branch as
$F[x][]$, $F[][x, C_1 - \{\bar{x}\}, C_2 - \{\bar{x}\}]$.



5 Algorithms for General MAXSAT 

In this section we present two algorithms for the general MAXSAT. One has the 
number of clauses as the exponent in the running time and the other has the 
length of the formula as the exponent in the running time. As in the previous 
section, we will first employ the reduction rules as far as possible and work with 
the reduced formula. It is also assumed that after applying every branching rule, 
any applicable reduction rules are applied. 



5.1 A Bound With Respect to the Number of Clauses 

We note that if $\{x\}$ is a unit clause then whether $x$ is set true or false, the clause
will be eliminated. This means that if $x$ is a $(k, l)$ literal, such that $n_u(x) = i$
(recall that $n_u(x)$ is the number of times $x$ occurs as a unit clause in the formula)
and $n_u(\bar{x}) = j$, then branching as $F[x][]$ and $F[][x]$ gives us a $(k + j, l + i)$
branch for the non-parameterized case (number of clauses eliminated), and a
$(k, l)$ branch for the parameterized case (number of clauses satisfied). In the
following, if we say that a branch is $(r, s)$ for the parameterized case, we mean
that in one branch $r$ clauses are satisfied and in the other $s$ clauses. Similarly
an $(r, s)$ branch for the non-parameterized case is a branch where $r$ clauses are
eliminated in one and $s$ in the other. Note also that an $(r, s)$ branch for the
parameterized case is also an $(r, s)$ branch for the non-parameterized case.
Algorithm:

1. If there is some $(1, 5^+)$ or a $(2^+, 3^+)$ literal $x$, branch as $F[x][]$ and $F[][x]$.
This gives good branches for the parameterized question (and hence for the
general MAXSAT as well).

2. If there is some $(1, 3^+)[2^+][\ldots]$ literal $x$, branch as $F[][x]$ and
$F[x][\{C - \{x\}\}]$, where $C$ is the clause containing $x$. This is a $(3, 2)$ branch for the
parameterized case. This is essentially branching rule 3b.

3. If there is some $(1, 4)[1][\ldots]$ literal $x$, branch as $F[x][]$ and $F[][x]$. Clearly,
a $(1, 5)$ branch for the non-parameterized case and a $(1, 4)$ branch for the
parameterized case.

4. If there is some $(1, 3)[1][2, \ldots]$ literal $x$, let $C$ denote the length 2 clause
containing $\bar{x}$ (choose any one if many). Branch as $F[][x]$ and $F[x, \{C - \{\bar{x}\}\}][]$.
Branching rule 3c. Clearly a $(3, 2)$ branch for the parameterized case.

5. If there is some $(2, 2)[1, \ldots][\ldots]$ literal $x$, then according to branching rule 2,
if $C_1$ and $C_2$ are the clauses containing $\bar{x}$, branch as $F[x][]$ if
$Co(C_1, C_2)$. Otherwise branch as $F[x][]$ and $F[][x, C_1 \cup C_2 - \{\bar{x}\}]$. Clearly a
$(2, 3)$ branch for the parameterized case, since $|C_1 \cup C_2 - \{\bar{x}\}| \ge 1$.

After the above rules can no longer be applied, the formula just contains
$(1, 3)[1][3^+, 3^+, 3^+]$ literals and their complements, $(2, 2)[2^+, 2^+][2^+, 2^+]$
literals and 3-variables.

6. If $y$ is a 3-variable which occurs in a clause containing a $(2, 2)$ literal $x$, then
branch as $F[x][]$ and $F[][x]$. Thus, branching rule 4 gives a $(2, 3)$ branch for
the parameterized case.

7. If $y$ is a 3-variable which occurs in a clause containing a $(3, 1)[\ldots][1]$ literal
$x$, then if $y$ occurs in all the 3 clauses containing $x$, branch as $F[][x]$. This
is because $y$ can be assigned so that at least 2 of the 3 clauses containing
$x$ are satisfied with $x$ = false. Otherwise, branch as $F[x][]$ and $F[][x]$. This
gives a $(1, 5)$ branch for the non-parameterized case and a $(1, 4)$ branch for the
parameterized case.

Now any clause containing a 3-variable contains only 3-variables.

So the subset of clauses containing 3-variables is closed. This can be
eliminated efficiently using the algorithm alluded to in the Introduction [2]. Thus
the formula now contains only 4-variables.

8. If $x$ is a $(2, 2)$ literal and some $y$ occurs in the two clauses containing $x$ or
in the two clauses containing $\bar{x}$, then branch as $F[x][]$ and $F[][x]$. A $(2, 3)$
branch for the parameterized case.

9. If $x$ is a $(1, 3)$ literal and any $y$ occurs in two or more clauses containing $\bar{x}$
then branch as $F[][x]$ and $F[x][]$. Thus, branching rule 4 gives a $(1, 4)$ branch
and a $(1, 5)$ branch for the parameterized case and the non-parameterized
case respectively.

At this stage, note that if $x$ is a $(1, 3)$ literal, and $C_1$, $C_2$ and $C_3$ are the
clauses containing $\bar{x}$, then except for $\bar{x}$ they all have disjoint variables. Also,
let $a_1$ be a $(2, 2)$ literal, and $C_1$ and $C_2$ be the clauses containing $a_1$. Let $a_2$
be any other literal; then after steps 8 and 9 of the algorithm at most one of
$C_1$ and $C_2$ can contain $a_2$.

This observation gives us the following lemma.

Lemma 1. Let $a_1, a_2, \ldots, a_l$ be $(2, 2)$ literals and let $C_1, \ldots, C_k$ be precisely
the clauses that contain some $a_i$, $1 \le i \le l$. After step 9 of the algorithm
$k \ge T(l)$,

where $T(l) = \min_j \max\{\lceil 2l/j \rceil, j + 1\}$, where the minimum is taken over all
$1 \le j \le l$. In particular $k \ge \lceil \sqrt{2l + 1/4} + 1/2 \rceil$.

Proof. Without loss of generality, let $C_1$ be the clause such that
$C_1 \cap \{a_1, a_2, \ldots, a_l\}$ has maximum cardinality $j$. Then the number of clauses
needed is at least $\lceil 2l/j \rceil$. Also by the condition imposed by step 9 of the
algorithm, each of the other occurrences of the $j$ variables must occur in different
clauses, thus at least $j$ additional clauses are needed. Thus the number of
clauses $k$ will be at least $T(l) = \min_j \max\{\lceil 2l/j \rceil, j + 1\}$. It can be verified that

$T(l) \ge \lceil \sqrt{2l + 1/4} + 1/2 \rceil$.
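
The quantity $T(l)$ is easy to tabulate, and the closed-form lower bound comes from balancing the two terms $2l/j$ and $j + 1$, which meet at $j = \sqrt{2l + 1/4} - 1/2$. The following sketch, with function names of our own choosing, computes both and compares them on a small range; it is a numerical sanity check, not a proof.

```python
import math


def T(l):
    """min over 1 <= j <= l of max(ceil(2l / j), j + 1)."""
    # -(-2 * l // j) is integer ceiling division of 2l by j.
    return min(max(-(-2 * l // j), j + 1) for j in range(1, l + 1))


def closed_form(l):
    """ceil(sqrt(2l + 1/4) + 1/2), the bound stated in the lemma."""
    return math.ceil(math.sqrt(2 * l + 0.25) + 0.5)


if __name__ == "__main__":
    for l in range(1, 11):
        print(l, T(l), closed_form(l))
```

The value $T(2) = 3$ is the one used in the later steps of the algorithm.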

10. Let $x$ be a $(1, 3)[1][\ldots]$ literal and let $C_1$, $C_2$ and $C_3$ be the clauses containing
$\bar{x}$. If $C_1$ contains a $(2, 2)$ literal $a$ or has size 4 or more then branch as
$F[x][]$, $F[][x, C_1 - \{\bar{x}\}]$, $F[][x, C_2 - \{\bar{x}\}, C_3 - \{\bar{x}\}]$. This is essentially branching
rule 5b.

a) If $y$ is a $(1, 3)$ literal whose variable occurs in $C_i$, then it is $\bar{y}$ that belongs
to $C_i$, since all $(1, 3)$ literals are $(1, 3)[1][\ldots]$ literals.

b) If there are $p$ $(3, 1)$ literals $u_1, \ldots, u_p$ and $q$ $(2, 2)$ literals $v_1, v_2, \ldots, v_q$
in $C_1 - \{\bar{x}\}$, then setting $C_1 - \{\bar{x}\}$ = false eliminates at least $p + T(q)$
clauses, other than $C_1$, $C_2$ and $C_3$. This is because, as the $\bar{u}_i$'s are present
as unit clauses, setting the $u_i$'s = false eliminates at least $p$ clauses and
since the $\bar{v}_i$'s will occur in clauses other than $C_2$ and $C_3$ (after step 9 of the
algorithm), setting the $v_i$'s = false will eliminate another $T(q)$ clauses.

$F[][x, \{C_1 - \{\bar{x}\}\}]$ satisfies at least 3 clauses other than $C_1$, $C_2$ and $C_3$, since
$p + T(q) \ge 3$ for $\{C_1 - \{\bar{x}\}\}$. Thus 6 clauses are satisfied and 7 are eliminated
by the assignment.

Similarly, since $\{C_2 - \{\bar{x}\}\}$ and $\{C_3 - \{\bar{x}\}\}$ have no common variable, the
assignment $F[][x, \{C_2 - \{\bar{x}\}\}, \{C_3 - \{\bar{x}\}\}]$ satisfies 4 clauses other than $C_1$, $C_2$
and $C_3$. Thus 7 clauses are satisfied and 8 are eliminated by the assignment.
Thus we have a $(1, 7, 8)$ branch for the non-parameterized case and a $(1, 6, 7)$
branch for the parameterized case.

Now the clauses containing $(1, 3)[1][3, 3, 3]$ literals and their complements
form a closed subformula $F_1$. The remaining clauses form a closed subformula
$F_2$ of $(2, 2)$ literals. So, $F = F_1 \wedge F_2$. Rule 11 describes how to eliminate $F_1$
efficiently.

11. If $a$ is a $(1, 3)$ literal, then the clauses with $a$ will be $\{a\}$, $\{\bar{a}, b, c\}$, $\{\bar{a}, \ldots\}$,
$\{\bar{a}, \ldots\}$. We branch as $F[][a]$, $F[a, b][]$ and $F[a, c][b]$.

We know that $b$ and $c$ will be $(3, 1)[3, 3, 3][1]$ literals; if $a$ is true and both
$b$ and $c$ are false, then $a$ can be set to false. For the parameterized case,
$F[][a]$ and $F[a, b][]$ satisfy 3 and 4 clauses respectively, and $F[a, c][b]$ satisfies
5 clauses (the unit clauses $\{a\}$, $\{\bar{b}\}$ and the 3 clauses containing $c$). Thus
we have a $(3, 4, 5)$ branch for the parameterized case. Similarly we have a
$(4, 5, 6)$ branch for the non-parameterized case.

12. If $x$ is a $(2, 2)[2^+, 2^+][3^+, 3^+]$ literal, let $C_1$ and $C_2$ be the clauses containing
$\bar{x}$. Apply branching rule 5a and branch as $F[x][]$, $F[][x, \{C_1 - \{\bar{x}\}\}]$ and
$F[][x, \{C_2 - \{\bar{x}\}\}]$.

The second and the third branch will eliminate at least $2 + T(2) = 5$ clauses
each. $F[x][]$ eliminates 2 clauses and also leads to 3-variables in the formula;
now either these variables would occur as a closed subformula or in a clause
with some $(2, 2)$ literal. Thus the next step would at least be a $(2, 3)$ branch or
better, since step 6 of this algorithm and the branching rules for $(n, 3)$ MAXSAT
are all good for the parameterized case. Thus we have a $(4, 5, 5, 5)$ branch for
the parameterized case.

13. Any 4-variable $a$ will be of the type $(2, 2)[2, 2^+][2, 2^+]$. Let $C_1 = \{a, b\}$ be
the clause of size 2 containing $a$. We will show how to branch with $a$ = false.
The case with $a$ = true is similar.

a) If $\bar{b}$ is present in some clause containing $\bar{a}$, then just branch as $F[b][a]$.
By rule 8 of this algorithm, it must be that $\bar{b}$ occurs in only one clause with $\bar{a}$;
thus, on setting $a$ = false, the clauses containing $b$ will reduce to $\{b\}$, $\{b, \ldots\}$
and $\{\bar{b}, \ldots\}$; thus setting $b$ = true is sufficient, and this is essentially
branching rule 3a.

b) Otherwise, if $C'_1$ and $C'_2$ are the clauses containing $\bar{b}$, then branch as
$F[b][a]$ and $F[][a, b, C'_1 \cup C'_2 - \bar{b}]$. Clearly $F[b][a]$ eliminates 4 clauses, and
the second branch eliminates the clauses containing $\bar{b}$, the unit clause
$\{b\}$ and another $3\,(= T(2))$ clauses corresponding to $C'_1 \cup C'_2 - \bar{b}$. So at
least 6 clauses are eliminated in the non-parameterized case and at least
5 clauses in the parameterized case.

Thus in the worst case the branch is a $(4, 6, 4, 6)$ branch for the
non-parameterized case and a $(4, 5, 4, 5)$ branch for the parameterized case.
For the non-parameterized case, the branch $(4, 5, 5, 5)$ is the most restrictive,
giving an $O(1.341294^m |F|)$ algorithm for MAXSAT by solving the appropriate
recurrence relation.

For the parameterized case, the branch $(1, 4)$ is the most restrictive, giving
an $O(1.380278^k k^2 + |F|)$ algorithm by reducing the problem to the kernel and
applying the above branching algorithm.

Thus we have proved the following.

Theorem 1. For a formula $F$ in CNF on $n$ variables and $m$ clauses, an assignment
satisfying the maximum number of clauses can be found in time
$O(1.341294^m |F|)$. Also an assignment satisfying at least $k$ clauses of $F$, if it exists,
can be found in time $O(k^2\, 1.380278^k + |F|)$.



5.2 A Bound With Respect to the Length of the Formula 

In this section, we call a branching vector good if it is a $(4, 11)$, $(5, 10)$, $(6, 8)$
or a $(7, 7)$ vector.

We observe that if $x$ is a $(p, q)$ literal, setting $x$ = true reduces the length of
the formula by at least $l(x) + q$ (recall that $l(x)$ is the sum of the lengths of the
clauses containing $x$). Also, since every variable is a $3^+$-variable after applying
the reduction rules, setting a value to $x$ reduces the length of the formula by at
least 3.

Using these facts and observing that both $x$ and $\bar{x}$ cannot be present as
unit clauses together, simply branching as $F[x][]$ and $F[][x]$ gives the recurrence
$T(|F|) \le T(|F| - 3) + T(|F| - 4) + |F|$ for the running time. This gives a very
simple $O(1.220745^{|F|} |F|)$ algorithm.
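
To make the two-way branching concrete, here is a minimal sketch of a solver that picks a variable and tries both $F[x][]$ and $F[][x]$. It deliberately omits the reduction rules, so it illustrates only the branching step and does not by itself achieve the bound derived above; the clause representation (a list of `frozenset`s of signed integers) and the name `max_sat` are assumptions of this example.

```python
def max_sat(clauses):
    """Maximum number of simultaneously satisfiable clauses, by two-way branching."""
    # An empty clause can never be satisfied, so it contributes nothing.
    clauses = [c for c in clauses if c]
    if not clauses:
        return 0
    x = abs(next(iter(clauses[0])))           # any variable still present
    best = 0
    for lit in (x, -x):                       # lit = x is F[x][], lit = -x is F[][x]
        satisfied = sum(1 for c in clauses if lit in c)
        # Drop satisfied clauses and delete the falsified literal from the rest.
        rest = [c - {-lit} for c in clauses if lit not in c]
        best = max(best, satisfied + max_sat(rest))
    return best


if __name__ == "__main__":
    formula = [frozenset(s) for s in ({1, 2}, {-1, 2}, {1, -2}, {-1, -2})]
    print(max_sat(formula))
```

Each recursive call removes the chosen variable from every clause, so the recursion depth is at most the number of variables.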

In what follows, we argue through a number of cases to obtain a more efficient
algorithm.

Algorithm 

1. If $x$ is a $7^+$-variable, then branching as $F[x][]$ and $F[][x]$ gives a good
branch.

2. If $x$ is a $(3, 3)$ literal, then either $l(x)$ or $l(\bar{x}) \ge 6$, since only one of $x$ and $\bar{x}$
can appear in a unit clause. Thus, branching as $F[x][]$ and $F[][x]$ gives us a
good branch.

Now the formula contains only $(1, k)$ literals, for $k \le 5$, or $(2, k)$ literals, for
$k \le 4$.

3. If $x$ is a $(1, k)[\ldots][1, \ldots]$ literal, then just branch as $F[][x]$. This is from
branching rule 3a.

4. The following rules give good branches for $k \ge 3$.

a) If $x$ is a $(1, k)[2^+][\ldots]$ literal, then just branch as $F[][x]$ and $F[x][C - \{x\}]$
where $C$ is the $2^+$-clause containing $x$. This is branching rule 3b and this
gives a $(2k + 1, k + 4)$ branch.

b) If $x$ is a $(1, k)[1][2, \ldots]$ literal, then just branch as $F[][x]$ and
$F[x, C - \{\bar{x}\}][]$. $C$ is the clause containing $\bar{x}$ of size 2. This is essentially branching
rule 3c. Clearly this gives us a $(2k + 1, k + 4)$ branch.

c) If $x$ is a $(1, k)[1][3^+, \ldots]$ literal, and $l(\bar{x}) \ge 10$, then just branch as $F[x][]$
and $F[][x]$. This gives a $(k + 1, l(\bar{x}) + 1)$ branch.

So now if $x$ is a $(1, k)$ literal, it has to be a $(1, 3)[1][3, 3, 3]$ literal or a $(1, 2)$
literal.

5. If $x$ is a $(2, k)[\ldots][1, 1, \ldots]$ literal then we branch as $F[][x]$. This is simply
the branching rule 1.

6. The following rules for eliminating $(2, k)$ literals give good branches for $k \ge 3$.

a) If $x$ is a $(2, k)[\ldots][1, 2^+, \ldots]$ literal then we branch as $F[x][]$ and $F[][x]$.
This gives a $(4 + k, 2k + 1)$ branch, since $x$ cannot appear in a unit clause.

b) If $x$ is a $(2, k)[\ldots][2^+, \ldots]$ literal and $l(x) \ge 3$, we branch as $F[x][]$ and $F[][x]$.
This gives a $(3 + k, 2k + 2)$ branch.

7. If $x$ is a $(2, k)[1, 1][2^+, \ldots]$ literal, branching as $F[x][]$ and $F[][x]$ gives a
$(2 + k, 2k + 2)$ branch, which is good for $k \ge 4$.

8. If $x$ is a $(2, 3)[1, 1][2^+, \ldots]$ literal, let $C_1$, $C_2$ and $C_3$ denote the clauses
containing $\bar{x}$.

a) If $Co(C_i, C_j)$ for some $1 \le i, j \le 3$, branch as $F[x][]$.

b) If $C_1$, $C_2$ and $C_3$ contain only one variable other than $x$, then $C_1 = C_2 =
C_3 = \{\bar{x}, c\}$, thus we branch as $F[x, c][]$ and $F[][x]$, which is at least an
$(8, 8)$ branch. This is because, since $c$ will be at most a 6-variable, setting
$x$ = true makes $c$ a $(3^+, 3^-)[1, 1, 1, \ldots][\ldots]$ literal, and so $c$ = true in an
optimal solution.

c) If $C_1$, $C_2$ and $C_3$ contain 2 or more variables other than $x$, then branching
as $F[x][]$ and $F[][x, C_1 \cup C_2 \cup C_3 - \{\bar{x}\}]$ gives a $(5, 11)$ branch.

Thus the formula contains only 3-variables, $(2, 2)$ literals and $(1, 3)[1][3, 3, 3]$
literals.

a) The rules below eliminate $(1, 3)[1][3, 3, 3]$ literals. If $x$ is a $(1, 3)[1][3, 3, 3]$ literal,
and some clause containing $\bar{x}$ contains a 3-variable $y$, then if all the 3
occurrences of $y$ are in clauses containing $\bar{x}$, just branch as $F[][x]$.
Otherwise branching as $F[x][]$ and $F[][x]$ gives a $(4, 11)$ branch.

b) If $x$ is a $(1, 3)[1][3, 3, 3]$ literal, and some clause containing $\bar{x}$ contains two or
more occurrences of a 4-variable $y$, then branching as $F[x][]$ and $F[][x]$
gives a $(4, 11)$ branch.

c) Now, if $x$ is a $(1, 3)[1][3, 3, 3]$ literal, then the clauses containing $\bar{x}$ contain
single occurrences of 4-variables other than $x$. We show that branching
according to the rule 5b gives a $(4, 16, 22)$ branch. Clearly $F[x][]$ reduces
the length by 4, and $F[][x]$ reduces the length of the formula by 10; now in
$F[][x]$, $|C_1| = 2$ and $|C_2 \cup C_3| = 4$, and since the variables in $C_1 \cup C_2 \cup C_3$
are 3-variables, we reduce the length of the formula by at least 3 for
each assignment of these.

The formula contains only $(2, 2)$ literals and 3-variables.

9. If $x$ is a $(2, 2)[3^+, 3^+][2^+, 2^+]$ literal or a $(2, 2)[2, 3^+][2, 3^+]$ literal, branching
as $F[x][]$ and $F[][x]$ gives a good branch.

Thus the formula just contains $(2, 2)[2^-, 2^-][2^-, 2^-]$ literals.

10. If $x$ is a $(2, 2)[1, 2][2, 2]$ literal, let $C$ denote the 2-clause containing $x$ and
$D_1$, $D_2$ denote the clauses containing $\bar{x}$.

a) If $C \cup D_1 \cup D_2 - \{x, \bar{x}\}$ contains only one variable, then the clauses
containing $x$ or $\bar{x}$ must be $\{x\}$, $\{x, c\}$, $\{\bar{x}, \bar{c}\}$ and $\{\bar{x}, \bar{c}\}$. We branch as $F[x][c]$;
it can be seen that this is a valid branch as $c$ is a 3-variable or a $(2, 2)$
literal.

b) If $C \cup D_1 \cup D_2 - \{x, \bar{x}\}$ contains three variables, then branching as $F[x][]$ and
$F[C - x][x, D_1 \cup D_2 - \{\bar{x}\}]$, we get a $(5, 13)$ branch. This is because $F[x][]$
reduces the length by 5, and the assignment to $x$ reduces the length by at
least 4 and the assignment to each of the other three variables reduces the
length by at least 3.

c) If $C \cup D_1 \cup D_2 - \{x, \bar{x}\}$ contains two variables then the clauses containing
$x$ or $\bar{x}$ can be

i. $\{x\}$, $\{x, b\}$, $\{\bar{x}, \bar{b}\}$ and $\{\bar{x}, c\}$. Branching as $F[x][]$ and
$F[C - x][x, D_1 \cup D_2 - \{\bar{x}\}]$, we get a $(7, 10)$ branch. For if $b$ is a 3-variable, then
it will be eliminated by setting $x$ = true. Otherwise if $b$ is a $(2, 2)$
literal, then the clauses containing $b$ will be $\{\bar{b}\}$, $\{b, \ldots\}$ and $\{\bar{b}, \ldots\}$,
which can be directly eliminated by setting $b$ = false.

ii. $\{x\}$, $\{x, b\}$, $\{\bar{x}, c\}$ and $\{\bar{x}, c\}$. Branching as $F[x][]$ and
$F[C - x][x, D_1 \cup D_2 - \{\bar{x}\}]$, we get a $(7, 10)$ branch, since on setting $x$ = true, $D_1$, $D_2$
will become unit clauses $\{c\}$, $\{c\}$, for which we will only have the one-way
branch $c$ = true.

11. If $x$ is a $(2, 2)[2, 2][2, 2]$ literal, then

a) if some clause containing $x$ contains a 3-variable $y$, then if the number of
occurrences of $y$ in the clauses containing $x$ is 1, then setting $x$ = true
reduces the length of the formula by at least 8. So $F[x][]$ and $F[][x]$
gives a $(6, 8)$ branch. Otherwise, if the number of occurrences of $y$ in
the clauses containing $x$ is 2 then setting $x$ = false gives 2 unit clauses
containing $y$, which can be eliminated directly. So $F[x][]$ and $F[][x]$ gives
a $(6, 8)$ branch.

b) Otherwise branch as $F[x][]$ and $F[][x]$ and apply rule 11 of this algorithm
for each instance. Setting $x$ = true or $x$ = false reduces the length of the
formula by 6, but this leads to a $(2, 2)[1, \ldots][2, 2]$ literal. So, we either
have a $(7, 10)$ branch in the next step or a $(5, 13)$ branch. Thus we have a
$(13, 16, 13, 16)$ branch or a $(11, 19, 11, 19)$ branch in the worst case.

12. The formula now contains only 3-variables; we now apply the algorithm
for $(n, 3)$ MAXSAT alluded to in the Introduction [2]. Since the length of
the formula $l$ will be equal to $3n$ (since $2^-$-variables do not exist in the
formula), we can eliminate the formula in 0(1.3247195 |F|) time. 

The branch $(4, 11)$ is the most restrictive, thus giving an $O(1.105729^{|F|} |F|)$
algorithm by solving the appropriate recurrence relation.

Thus we have proved

Theorem 2. For a formula $F$ in conjunctive normal form, an assignment
satisfying the maximum number of clauses can be found in time $O(1.105729^{|F|} |F|)$.



6 Conclusions 

It would be interesting to see how our algorithms perform in practice. Also 
it would be useful to identify some general branching techniques which would 
reduce the number of cases in the algorithms. 

Abstract. A positive (or monotone) Boolean function is regular if its
variables are naturally ordered, left to right, by decreasing strength, so 
that shifting the non-zero component of any true vector to the left always 
yields another true vector. In this paper, we propose a simple linear time 
algorithm to recognize whether a positive function is regular. 



1 Introduction 

A Boolean function, or a function in short, is a mapping $f : \{0, 1\}^n \to \{0, 1\}$,
where $v \in \{0, 1\}^n$ is called a Boolean vector (a vector in short). Let
$V = \{1, 2, \ldots, n\}$. For a pair of vectors $v, w \in \{0, 1\}^n$, we write $v \le w$ if $v_j \le w_j$
holds for all $j \in V$, and $v < w$ if $v \le w$ and $v \ne w$, where we define $0 < 1$. If
$f(v) = 1$ (resp., 0), then $v$ is called a true (resp., false) vector of $f$. The set of
all true vectors (resp., false vectors) of $f$ is denoted by $T(f)$ (resp., $F(f)$). We
use notations $ON(v) = \{j \mid v_j = 1, j \in V\}$ and $OFF(v) = \{j \mid v_j = 0, j \in V\}$. A
function $f$ is positive if $v \le w$ always implies $f(v) \le f(w)$. A positive function
is also said to be monotone. A true vector $v$ of $f$ is minimal if there is no other
true vector $w$ such that $w < v$. Let $\min T(f)$ denote the set of all minimal true
vectors of $f$.

If $f$ is positive, it is known that $f$ has a unique minimal disjunctive normal
form (DNF), consisting of all prime implicants. There is a one-to-one
correspondence between prime implicants and minimal true vectors. For example, a
positive function $f = x_1x_2 \vee x_2x_3 \vee x_3x_1$ has prime implicants $x_1x_2$, $x_2x_3$, and
$x_3x_1$, which correspond to minimal true vectors (110), (011), and (101),
respectively. Thus, the input length to describe a positive function $f$ is $O(n|\min T(f)|)$
if it is represented in this manner.

A positive function $f$ is said to be regular if, for every $v \in \{0, 1\}^n$ and every
pair $(i, j)$ with $i < j$, $v_i = 0$ and $v_j = 1$, the following condition holds:

$$f(v) \le f(v + e^{(i)} - e^{(j)}), \qquad (1)$$

where $e^{(k)}$ denotes the unit vector which has a 1 in its $k$-th position and 0 in
all other positions. A positive function $f$ is called 2-monotonic if there exists
a linear ordering on $V$ for which $f$ is regular. Let $C_R$ and $C_{2M}$, respectively,
denote the classes of regular and 2-monotonic functions.

The 2-monotonicity and related concepts have been studied in various contexts
in fields such as threshold logic |8I6I12I18|, game theory, hypergraph theory
lU and learning theory |4|9|10|. The 2-monotonicity was originally introduced in
conjunction with threshold functions (e.g., m), where a positive function $f$ is a
threshold function if there exist $n + 1$ nonnegative real numbers $w_1, w_2, \ldots, w_n$
and $t$ such that:

$$f(x) = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} w_i x_i \ge t \\ 0, & \text{if } \sum_{i=1}^{n} w_i x_i < t. \end{cases}$$

As this $f$ satisfies (1) by permuting variables so that $w_i > w_j$ implies $i < j$, a
threshold function is always 2-monotonic, although the converse is not true Pi-
Let $C_{TH}$ denote the class of all threshold functions.
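
For small $n$ the link between threshold functions and condition (1) can be checked by enumeration. The sketch below builds a threshold function from given weights, lists its minimal true vectors, and tests condition (1) directly over all of $\{0,1\}^n$. The helper names and the example weights are ours, and the exhaustive loops are meant only for illustration on tiny inputs.

```python
from itertools import product


def threshold_function(weights, t):
    """f(x) = 1 iff sum_i w_i * x_i >= t."""
    return lambda x: int(sum(w * xi for w, xi in zip(weights, x)) >= t)


def min_true_vectors(f, n):
    """All minimal true vectors of a positive function f, by enumeration."""
    trues = [v for v in product((0, 1), repeat=n) if f(v)]

    def strictly_below(w, v):
        return w != v and all(a <= b for a, b in zip(w, v))

    return [v for v in trues if not any(strictly_below(w, v) for w in trues)]


def satisfies_condition_1(f, n):
    """Check f(v) <= f(v + e_i - e_j) for every v and i < j with v_i = 0, v_j = 1."""
    for v in product((0, 1), repeat=n):
        for i in range(n):
            for j in range(i + 1, n):
                if v[i] == 0 and v[j] == 1:
                    w = list(v)
                    w[i], w[j] = 1, 0
                    if f(v) > f(tuple(w)):
                        return False
    return True


if __name__ == "__main__":
    f = threshold_function((3, 2, 2, 1), 4)   # weights already non-increasing
    print(min_true_vectors(f, 4), satisfies_condition_1(f, 4))
```

With the weights sorted in non-increasing order the check succeeds, while an unsorted weight vector can fail it even though the function is still 2-monotonic.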

In this paper, we consider the recognition problem for regular and 2-monotonic 
functions. The recognition problem is defined for a class of positive functions C 
as follows: 



Problem RECOG($C$)

Input: $\min T(f)$, where $f$ is a positive function of $n$ variables.

Question: Does $f$ belong to $C$?

The recognition problem is also called the representation problem in
computational learning theory, and has been studied to prove the hardness of
learnability jj. It is known [J that, given a polynomially size-bounded and
polynomially reasonable class of functions $C$, if $C$ is polynomially exactly learnable
with membership, equivalence, exhaustiveness, disjointness and superset
queries, then RECOG($C$) is in coNP. This fact is used to show that, if RECOG($C$)
is NP-hard under reductions (also called many-one reduction), then $C$ is not
polynomially exactly learnable with the above queries unless NP = coNP.

Problem RECOG($C_{2M}$) has been used as a server to solve RECOG($C_{TH}$)
[11 81618]. Problem RECOG($C_{TH}$) is a classical problem in threshold logic [T2],
and is also called the threshold synthesis problem. The polynomial solvability of
this classical problem was open for a number of years, until the publication by
U.N. Peled and B. Simeone m- Their algorithm consists of the following three
steps. The first step solves RECOG($C_{2M}$). If $f$ is not 2-monotonic, then output
“No” and halt. Since $C_{TH} \subseteq C_{2M}$, we can conclude that $f$ is not threshold in this
case. The second step computes the set of the maximal false vectors $\max F(f)$.
If / is 2-monotonic, we can compute maxF(/) in 0(n^ | min T(/)|) time [6l8j . 
Finally, we check if there is a hyperplane $\sum_{i=1}^{n} w_i x_i = t$ that separates $\min T(f)$
from $\max F(f)$, which can be solved in polynomial time [8I7J .

Besides these, problem RECOG($C_{2M}$) is important in practical applications,
since a number of intractable problems become tractable if we restrict our
attention to 2-monotonic functions. Such examples include the set covering problem
m\ , reliability computation [2], the dualization problem mm and the exact learning
problem m-

Let us consider the definition of regularity. It is easy to see that it is equivalent
to the following: for every $v \in \min T(f)$ and every pair $(i, j)$ with $i < j$, $v_i = 0$
and $v_j = 1$,

$$f(v + e^{(i)} - e^{(j)}) = 1. \qquad (2)$$

This implies that problem RECOG($C_R$) can be solved in $O(n^3 |\min T(f)|^2)$ time.
J. S. Provan and M. O. Ball [14] have improved it to $O(n^2 |\min T(f)|)$ time. Since
$|\min T(f)| \gg n$ can be expected in most cases, this is a significant improvement
upon the straightforward approach. To achieve the improvement, they make use
of two facts. The first is given in the following proposition:

Proposition 1. [12] Let $f$ be a positive function. Then $f$ is regular if and only
if, for all pairs of $v \in \min T(f)$ and $i \in V$ with $v_i = 0$ and $v_{i+1} = 1$,

$$f(v + e^{(i)} - e^{(i+1)}) = 1 \qquad (3)$$

holds.

This proposition leads to an $O(n^2 |\min T(f)|^2)$-time algorithm.
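
The straightforward algorithm implied by Proposition 1 can be sketched as follows. Vectors are tuples of 0/1, and membership $f(w) = 1$ is decided by scanning $\min T(f)$ for a vector dominated by $w$, which costs $O(n |\min T(f)|)$ per query. The names `is_true` and `is_regular` are assumptions of this example, and this is the naive quadratic method, not the faster algorithms discussed in the text.

```python
def is_true(v, min_true):
    """f(v) = 1 iff v dominates some minimal true vector."""
    return any(all(a <= b for a, b in zip(w, v)) for w in min_true)


def is_regular(min_true):
    """Naive test of condition (3) for every v in minT(f) and adjacent pair."""
    for v in min_true:
        for i in range(len(v) - 1):
            if v[i] == 0 and v[i + 1] == 1:
                w = list(v)
                w[i], w[i + 1] = 1, 0      # w = v + e^(i) - e^(i+1)
                if not is_true(w, min_true):
                    return False
    return True


if __name__ == "__main__":
    # Minimal true vectors of the function in equation (4).
    example = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0),
               (0, 1, 1, 1, 0, 0), (0, 1, 1, 0, 1, 1)]
    print(is_regular(example))
```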

The next one is an $O(n)$-time membership oracle (i.e., an algorithm to check
if $f(v) = 1$ or not for a given vector $v$) for a regular function $f$. It makes use
of a binary tree as a data structure for $\min T(f)$. For instance, Figure 1 shows
such a binary tree for a positive function

$$f = x_1x_2 \vee x_1x_3x_4 \vee x_1x_3x_5 \vee x_2x_3x_4 \vee x_2x_3x_5x_6, \qquad (4)$$

i.e., $\min T(f) = \{v^{(1)} = (110000), v^{(2)} = (101100), v^{(3)} = (101010), v^{(4)} =
(011100), v^{(5)} = (011011)\}$. The importance of their membership oracle is that,
even if $f$ is not regular, the answer is correct if it outputs “Yes” (but may not be
correct if it outputs “No”). Therefore, for each $v \in \min T(f)$ and $i$ with $v_i = 0$
and $v_{i+1} = 1$, we can check condition (3) in $O(n)$ time. By the above discussion,
if the oracle outputs “Yes”, then it is correct, i.e., $f(v + e^{(i)} - e^{(i+1)}) = 1$. On
the other hand, if the oracle outputs “No”, then it may be wrong, but we can
conclude that $f$ is not regular (if the answer is correct, $f(v + e^{(i)} - e^{(i+1)}) = 0$,
implying that $f$ is not regular; otherwise, $f$ is not regular, since the oracle gives
the wrong answer). Hence $O(n^2 |\min T(f)|)$ time is achieved.

As for the 2-monotonicity, R. O. Winder shows the following nice proposition.

Proposition 2. [15] Let $f$ be a 2-monotonic positive function of $n$ variables.
Define the $n$-dimensional vectors $a^{(j)}$ ($j \in V$) by

$$a^{(j)}_k = |\{v \in \min T(f) : v_j = 1, |ON(v)| = k\}|, \quad k \in V.$$

Let $a^{(j_1)} \ge_{lex} a^{(j_2)} \ge_{lex} \ldots \ge_{lex} a^{(j_n)}$, where $\ge_{lex}$ denotes the
lexicographic order between $n$-dimensional vectors, and let $\pi$ be a permutation on $V$ such
that $\pi(j_i) = i$ for all $i$. Then $f$ becomes regular by permuting $V$ by $\pi$.

This proposition says that, in order to check the 2-monotonicity of a positive
function $f$, we first permute $V$ by the above $\pi$, and then check the regularity of



262 



K. Makino 



the resulting function. Since we can permute by $\pi$ in $O(n^2 + n|\min T(f)|)$ time,
the result by J. S. Provan and M. O. Ball also implies an $O(n^2 |\min T(f)|)$-time
algorithm for checking the 2-monotonicity.
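
The two-stage test described by Proposition 2 can be sketched as follows: compute the vectors $a^{(j)}$, sort the variables by decreasing lexicographic order of these vectors, permute, and test regularity. The regularity test used here is the naive one based on condition (3), so the sketch does not reach the running times quoted in the text; all function names are ours, and the input is assumed to be a non-empty list of 0/1 tuples of equal length.

```python
def is_true(v, min_true):
    """f(v) = 1 iff v dominates some minimal true vector."""
    return any(all(a <= b for a, b in zip(w, v)) for w in min_true)


def is_regular(min_true):
    """Naive test of condition (3)."""
    for v in min_true:
        for i in range(len(v) - 1):
            if v[i] == 0 and v[i + 1] == 1:
                w = list(v)
                w[i], w[i + 1] = 1, 0
                if not is_true(w, min_true):
                    return False
    return True


def winder_order(min_true):
    """Variable indices j_1, ..., j_n (0-based) sorted by a^(j), largest first."""
    n = len(min_true[0])

    def a(j):
        counts = [0] * n
        for v in min_true:
            if v[j] == 1:
                counts[sum(v) - 1] += 1    # slot k-1 holds a_k^(j), k = |ON(v)|
        return counts

    return sorted(range(n), key=a, reverse=True)


def is_2_monotonic(min_true):
    """Permute the variables by Winder's order, then test regularity."""
    order = winder_order(min_true)
    permuted = [tuple(v[j] for j in order) for v in min_true]
    return is_regular(permuted)


if __name__ == "__main__":
    print(is_2_monotonic([(0, 1)]), is_2_monotonic([(1, 1, 0, 0), (0, 0, 1, 1)]))
```

If the permuted function fails the regularity test, the proposition implies by contraposition that no ordering works, so the function is not 2-monotonic.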

In this paper, we present an $O(n|\min T(f)|)$-time algorithm for checking the
regularity of a positive function $f$. Our algorithm makes use of a fully condensed
binary tree (FCB) as a data structure for $\min T(f)$ (see the definition in
Section 2). The algorithm checks condition (3) by finding a “left-most” vector
$w \in \min T(f)$ such that $w \le v + e^{(i)} - e^{(i+1)}$. It explores a fully condensed
binary tree in the breadth-first fashion. Since the number of nodes in FCB is
$O(|\min T(f)|)$, our algorithm only requires $O(n|\min T(f)|)$ time.

By combining this with the result by R. O. Winder, we also provide an
$O(n(n + |\min T(f)|))$-time algorithm for checking the 2-monotonicity of a
positive function $f$.

The rest of this paper is organized as follows. Section 2 introduces three
types of binary trees called ordinary, partially condensed, and fully condensed binary
trees as data structures for $\min T(f)$. In Section 3, we give a simple algorithm
for RECOG($C_R$). The algorithm uses a partially condensed tree, and its running
time is $O(n^2 |\min T(f)|)$, which is equal to the best known result by J. S. Provan
and M. O. Ball [14]. Section 4 improves the algorithm described in Section 3 to
a linear time one by making use of a fully condensed tree.

For space reasons, proofs of some results are omitted (see [TT)~). 

2 Data Structures for $\min T(f)$

In this section, we present three data structures to represent $\min T(f)$. The
first one is an ordinary binary tree $B(\min T(f))$ of height $n$, in which the left
edge (resp., right edge) from a node at depth $j - 1$ represents the case $x_j = 1$
(resp., $x_j = 0$). A leaf node $t$ of $B(S)$ at depth $n$ stores the vector $v \in S$
($\subseteq \{0, 1\}^n$), the components of which correspond to the edges of the path from
the root to $t$. In order to have a compact representation, the edges with no
descendants are removed from $B(S)$. For example, Figure 1 shows the binary
tree for $\min T(f) = \{v^{(1)} = (110000), v^{(2)} = (101100), v^{(3)} = (101010), v^{(4)} =
(011100), v^{(5)} = (011011)\}$.

The next one is called a partially condensed binary tree $PCB(\min T(f))$. It is
a binary tree obtained from $B(\min T(f))$ by recursively removing all leaves $t$
such that the parent of $t$ has no left-child. Figure 2 gives an example.
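The two constructions above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the paper: a tree is stored as the set of its root-to-node paths (tuples of 0/1 edge labels, the root being the empty tuple), and the helper names `build_b` and `condense_to_pcb` are hypothetical.

```python
def build_b(vectors):
    """Node set of B(S): every prefix of every vector; the root is ()."""
    nodes = {()}
    for v in vectors:
        for d in range(1, len(v) + 1):
            nodes.add(tuple(v[:d]))
    return nodes


def condense_to_pcb(nodes):
    """Recursively delete leaves whose parent has no left-child (1-edge)."""
    nodes = set(nodes)
    changed = True
    while changed:
        changed = False
        # Visit deepest nodes first so that whole chains disappear in one pass.
        for t in sorted(nodes, key=len, reverse=True):
            if not t:
                continue  # never remove the root
            is_leaf = t + (1,) not in nodes and t + (0,) not in nodes
            parent_has_left = t[:-1] + (1,) in nodes
            if is_leaf and not parent_has_left:
                nodes.remove(t)
                changed = True
    return nodes


# The five-vector example used in this section.
MIN_T = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0),
         (0, 1, 1, 1, 0, 0), (0, 1, 1, 0, 1, 1)]
B = build_b(MIN_T)
PCB = condense_to_pcb(B)
print(len(B), len(PCB))
```

For this example the removal rule simply strips the trailing 0-edges of each vector, so the leaf for (110000) becomes the node reached by the path (1, 1).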

We finally define a fully condensed binary tree $FCB(\min T(f))$. It is a binary
tree obtained from $PCB(\min T(f))$ by recursively shrinking all edges
$(s_1, s_2)$ such that $s_1$ has no right-child. Differently from $B$ and $PCB$,
every node $s$ of $FCB$ has a label $L(s)$, where $L(s)$ is either $\emptyset$ or
the interval of variables $[L(s)_1, L(s)_2] = [x_i, x_j]$. Let $s$ be a node in
$FCB$ obtained from a path $t_1 \to t_2 \to \cdots \to t_l$ in $PCB$ by shrinking
all edges $(t_i, t_{i+1})$ for $i = 1, 2, \ldots, l-1$, where $(t_i, t_{i+1})$,
$i = 1, 2, \ldots, l-1$, has a label $x_{p+i} = 1$. If $s$ is either a
right-child or the root, then $L(s) = [x_{p+1}, x_{p+l-1}]$ for $l \ne 1$, and
$L(s) = \emptyset$ for $l = 1$ (i.e., $s$ is constructed from a node $t_1$ in
$PCB$). On the other hand, if $s$ is a left-child, then
$L(s) = [x_p, x_{p+l-1}]$, since the edge $(parent(t_1), t_1)$ in $PCB$ has a
label $x_p = 1$. Note that $L(s) \ne \emptyset$ holds for all nodes $s$ that are
left-children of their parents. A leaf node $t$ of $FCB(S)$ stores the vector
$v \in S$ ($\subseteq \{0,1\}^n$), where $v$ is represented by the sequence of
the labels appearing on the path from the root to $t$; e.g., $v^{(4)}$ is
represented by $\emptyset$, $[x_2, x_3]$ and $[x_4, x_4]$, implying
$v^{(4)} = (011100)$ ($v^{(4)}_2 = v^{(4)}_3 = v^{(4)}_4 = 1$ and
$v^{(4)}_i = 0$ for every other $i$). Figure 3 shows the corresponding example to
Figures 1 and 2.

Fig. 1. A binary tree $B(\min T(f))$, where $f$ is the example function given above.

Fig. 2. A partially condensed binary tree $PCB(\min T(f))$ for the same function $f$.
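The labelling rule for FCB can be made concrete with a short Python sketch. It rests on assumptions that are not in the paper: trees are stored as sets of 0/1 paths, the PCB is obtained by stripping trailing zeros from each vector (which agrees with the removal rule when the input vectors are pairwise incomparable, as minimal true vectors are), and a label $[x_i, x_j]$ is represented by the pair `(i, j)` with `None` for the empty label. The function names are hypothetical.

```python
from collections import deque


def pcb_nodes(vectors):
    """PCB as a set of 0/1 paths (assumes pairwise incomparable vectors)."""
    nodes = {()}
    for v in vectors:
        v = tuple(v)
        while v and v[-1] == 0:
            v = v[:-1]
        for d in range(1, len(v) + 1):
            nodes.add(v[:d])
    return nodes


def build_fcb(vectors):
    """Return the FCB nodes in breadth-first order as (kind, label, pcb_end)."""
    pcb = pcb_nodes(vectors)
    fcb = []
    queue = deque([((), "root")])
    while queue:
        t1, kind = queue.popleft()
        t = t1
        # Shrink edges (t, left-child(t)) while t has no right-child.
        while t + (1,) in pcb and t + (0,) not in pcb:
            t = t + (1,)
        if kind == "left":
            label = (len(t1), len(t))        # [x_p, x_{p+l-1}]
        elif t == t1:
            label = None                     # l = 1: empty label
        else:
            label = (len(t1) + 1, len(t))    # [x_{p+1}, x_{p+l-1}]
        fcb.append((kind, label, t))
        if t + (1,) in pcb:
            queue.append((t + (1,), "left"))
        if t + (0,) in pcb:
            queue.append((t + (0,), "right"))
    return fcb


MIN_T = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0),
         (0, 1, 1, 1, 0, 0), (0, 1, 1, 0, 1, 1)]
FCB = build_fcb(MIN_T)
for number, (kind, label, _) in enumerate(FCB, start=1):
    print(f"s{number}", kind, label)
```

For the running example the sketch produces nine nodes; the path root, $[x_2, x_3]$, $[x_4, x_4]$ spells out the vector (011100), in line with the text.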

For a node $s$ in a binary tree ($B$, $PCB$ or $FCB$), let $parent(s)$,
$grandparent(s)$, $left\text{-}child(s)$ and $right\text{-}child(s)$ denote the
parent, grandparent, left-child and right-child of $s$, respectively. For example,
node $s_5$ in Figure 2 has parent $s_2$, grandparent $s_1$, left-child $s_7$, and
no right-child. Hence $parent(s_5) = s_2$, $grandparent(s_5) = s_1$,
$left\text{-}child(s_5) = s_7$ and $right\text{-}child(s_5) = \emptyset$.
Fig. 3. A fully condensed binary tree $FCB(\min T(f))$ for the same function $f$; special nodes are marked.

3 An $O(n^2|\min T(f)|)$ Time Algorithm for RECOG($C_R$)

In this section, to make the linear time algorithm for problem RECOG($C_R$)
easier to understand, we first present an $O(n^2|\min T(f)|)$ time algorithm for
it. The algorithm makes use of a partially condensed tree as a data structure for
$\min T(f)$.

Let $s$ be a node in PCB at depth $d$, where the root is at depth 0. Let
$\sigma(s)$ be a binary vector of length $d$, whose $i$-th component is the value
of $x_i$ on the path from the root to $s$. For example, we have
$\sigma(s_5) = (10)$ and $\sigma(s_{11}) = (0111)$ for nodes $s_5$ and $s_{11}$ in
Figure 2. Node $s$ is said to be special if the last two components of
$\sigma(s)$ are 0 and 1, respectively. That is, $s$ is a left-child and its parent
is a right-child. For a special node $s$, a node $t$ is called its support if
either $t = right\text{-}child(left\text{-}child(grandparent(s)))$ or $t$ is a leaf
and $t = left\text{-}child(grandparent(s))$. Let $support(s)$ denote the support of
a special node $s$. For example, nodes $s_6$, $s_7$, $s_{13}$ and $s_{14}$ in
Figure 2 are special, and their supports are $support(s_6) = s_5$,
$support(s_7) = s_4$, $support(s_{13}) = s_9$ and $support(s_{14}) = s_{11}$. For
a vector $v \in \{0,1\}^n$ and a nonnegative integer $k$, let $ON(v)_k$ denote the
first $k$ elements in $ON(v)$ in the order of variables; e.g., for a vector
$v = (110101)$, $ON(v)_2 = \{1,2\}$ and $ON(v)_3 = \{1,2,4\}$.
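The definitions of special nodes, supports and $ON(v)_k$ can be checked mechanically on the running example. The Python sketch below is illustrative only: nodes are identified with their vectors $\sigma(s)$ stored as tuples, the PCB is built by stripping trailing zeros (an assumption that is valid for pairwise incomparable vectors), and all function names are hypothetical.

```python
def pcb_nodes(vectors):
    """PCB as a set of 0/1 paths sigma(s); the root is ()."""
    nodes = {()}
    for v in vectors:
        v = tuple(v)
        while v and v[-1] == 0:
            v = v[:-1]
        for d in range(1, len(v) + 1):
            nodes.add(v[:d])
    return nodes


def is_special(s):
    """s is special iff the last two components of sigma(s) are 0 and 1."""
    return len(s) >= 2 and s[-2:] == (0, 1)


def support(s, nodes):
    """Support of a special node s, or None if it does not exist."""
    lc = s[:-2] + (1,)                      # left-child(grandparent(s))
    if lc not in nodes:
        return None
    if lc + (1,) not in nodes and lc + (0,) not in nodes:
        return lc                           # lc is a leaf
    rc = lc + (0,)                          # right-child(left-child(grandparent(s)))
    return rc if rc in nodes else None


def on_prefix(v, k):
    """ON(v)_k: the first k indices i (1-based) with v_i = 1."""
    return [i for i, x in enumerate(v, start=1) if x == 1][:k]


MIN_T = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0),
         (0, 1, 1, 1, 0, 0), (0, 1, 1, 0, 1, 1)]
NODES = pcb_nodes(MIN_T)
SUPPORTS = {s: support(s, NODES) for s in NODES if is_special(s)}
for s in sorted(SUPPORTS, key=lambda t: (len(t), t)):
    print(s, "->", SUPPORTS[s])
```

With the breadth-first node numbering of Figure 2, the four entries correspond to $s_6$, $s_7$, $s_{13}$ and $s_{14}$ and their supports listed above.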

The next lemma is crucial for our algorithms. 

Lemma 3. A positive function $f$ is regular if and only if for each pair of
$v \in \min T(f)$ and $i \in V$ with $v_i = 0$ and $v_{i+1} = 1$, there exists a
vector $w \in \min T(f)$ such that

$i \in ON(w) = ON(v + e^{(i)} - e^{(i+1)})_{|ON(w)|}$    (5)
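As a sanity check of the lemma, the condition can be tested by brute force: for every pair $(v, i)$ with $v_i = 0$ and $v_{i+1} = 1$, look for a vector $w$ in the input whose ON-set consists of the first $|ON(w)|$ elements of $ON(v + e^{(i)} - e^{(i+1)})$. The Python sketch below does exactly this; it is a direct, unoptimized check for small inputs and not one of the tree-based algorithms of this paper. The function names are hypothetical, and the input is assumed to be a list of pairwise incomparable 0/1 tuples.

```python
def on_set(v):
    """ON(v) as an increasing tuple of 1-based indices."""
    return tuple(i for i, x in enumerate(v, start=1) if x == 1)


def is_regular_bruteforce(min_t):
    """Check the prefix condition of Lemma 3 for every pair (v, i)."""
    on_sets = {on_set(v) for v in min_t}
    for v in min_t:
        for i in range(1, len(v)):            # i is 1-based; compares v_i, v_{i+1}
            if v[i - 1] == 0 and v[i] == 1:
                u = list(v)
                u[i - 1], u[i] = 1, 0         # u = v + e^(i) - e^(i+1)
                on_u = on_set(u)
                # Some non-empty prefix of ON(u) must be the ON-set of a vector w.
                if not any(on_u[:k] in on_sets for k in range(1, len(on_u) + 1)):
                    return False
    return True


MIN_T = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0),
         (0, 1, 1, 1, 0, 0), (0, 1, 1, 0, 1, 1)]
print(is_regular_bruteforce(MIN_T))
print(is_regular_bruteforce([(1, 1, 0), (0, 1, 1)]))
```

The second call uses $f = x_1x_2 \vee x_2x_3$, for which the pair $v = (011)$, $i = 1$ has no suitable $w$.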

The following algorithm tries to find a vector $w \in \min T(f)$ satisfying (5)
for each pair $(v, i)$ such that $v \in \min T(f)$ and $i \in V$ with $v_i = 0$
and $v_{i+1} = 1$. Note that, in $PCB(\min T(f))$, if a leaf representing a vector
$v \in \min T(f)$ is a descendant of a special node $s$, then $v_i = 0$ and
$v_{i+1} = 1$ for some $i$, and these are the labels of the two edges immediately
upstream from $s$. Thus the reader can imagine the path representing the imaginary
vector $v + e^{(i)} - e^{(i+1)}$ in $PCB(\min T(f))$.

In the algorithm, a set of nodes $S(s)$ is associated with each node $s$ in PCB.
$S$ is initialized as $S(s) = \emptyset$ for all nodes $s$. We scan the nodes of
$PCB(\min T(f))$ breadth-first. The node $s$ (say, at depth $d$) whose $S(s)$
becomes non-empty is a special node ($support(s)$ is added to it). Intuitively, if
a leaf representing $v \in \min T(f)$ is a descendant of $s$, then
$\sigma(support(s))$ can be the first $d$ components of the vector $w$ in (5).
Once $S(parent(s))$ becomes non-empty, $S(s)$ is updated, so that the following
property is maintained: if $t \in S(s)$, then $\sigma(t)$ is a prefix of $w$ in
(5). Finally, if $f$ is regular, for every leaf $t$ representing
$v \in \min T(f)$, $S(t)$ will contain the set of all $w$'s in (5). $S(t)$ will
contain as many $w$'s as there are special nodes on the path from the root to $t$.

Algorithm CHECK-PCB 

Input: minT(f), where f is a positive function of n variables.

Output: If f is regular, "Yes"; otherwise "No".

Step 0. construct a partially condensed binary tree PCB(minT(f));

Step 1. if there is a node s in PCB(minT(f)) such that left-child(s) = ∅ and right-child(s) ≠ ∅
    then output "No" and halt
  end;

Step 2. for each node s do
    S(s) := ∅
  end{for};
  d := 2; /* Initialize depth d, where the root is at depth 0. */

Step 3. for each node s at depth d do begin
    if s = left-child(parent(s)) then
      call Procedure LEFT;
      if LEFT = 0 then output "No" and halt end
    else call Procedure RIGHT; /* i.e., s = right-child(parent(s)). */
      if RIGHT = 0 then output "No" and halt end
    end;
  end{for};
  d := d + 1;
  if there is no node at depth d then goto Step 4;
  else goto Step 3;
  end;

Step 4. for each leaf t do begin
    if S(t) contains an internal node then output "No" and halt end
  end{for};
  output "Yes" and halt.

Procedure LEFT
  for each node t in S(parent(s)) do begin
    if t is a leaf then S(s) := S(s) ∪ {t};
    else S(s) := S(s) ∪ {left-child(t)}; /* Note that left-child(t) ≠ ∅. */
    end
  end{for};
  if s is special then
    if support(s) = ∅ then LEFT := 0 and return; /* Regularity test fails. */
    else S(s) := S(s) ∪ {support(s)};
    end
  end;
  LEFT := 1 and return.

Procedure RIGHT
  for each node t in S(parent(s)) do begin
    if t is a leaf then S(s) := S(s) ∪ {t};
    else if right-child(t) = ∅ then RIGHT := 0 and return; /* Regularity test fails. */
    else S(s) := S(s) ∪ {right-child(t)};
    end
    end;
  end{for};
  RIGHT := 1 and return.

Although we omit the details (see [11]), Algorithm CHECK-PCB gives an
$O(n^2|\min T(f)|)$ time algorithm for RECOG($C_R$).

Theorem 4. Algorithm CHECK-PCB correctly checks if a given positive function $f$
is regular in $O(n^2|\min T(f)|)$ time.
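For readers who want to experiment, Algorithm CHECK-PCB can be transcribed into Python roughly as follows. This is a sketch under assumptions, not the authors' implementation: nodes are stored as 0/1 paths, the PCB is built by stripping trailing zeros (valid when the input vectors are pairwise incomparable), Python sets are used for $S(s)$, and no attempt is made to match the stated time bound. The names `pcb_nodes` and `check_pcb` are hypothetical.

```python
def pcb_nodes(vectors):
    """PCB as a set of 0/1 paths; the root is ()."""
    nodes = {()}
    for v in vectors:
        v = tuple(v)
        while v and v[-1] == 0:
            v = v[:-1]
        for d in range(1, len(v) + 1):
            nodes.add(v[:d])
    return nodes


def check_pcb(vectors):
    """Return True ("Yes") if the function with these minimal true vectors is regular."""
    nodes = pcb_nodes(vectors)                       # Step 0

    def child(t, bit):
        c = t + (bit,)
        return c if c in nodes else None

    def is_leaf(t):
        return child(t, 1) is None and child(t, 0) is None

    # Step 1: a node with a right-child but no left-child rules out regularity.
    if any(child(s, 1) is None and child(s, 0) is not None for s in nodes):
        return False
    S = {s: set() for s in nodes}                    # Step 2
    for s in sorted(nodes, key=len):                 # Step 3: breadth-first by depth
        if len(s) < 2:
            continue
        inherited = S[s[:-1]]
        if s[-1] == 1:                               # Procedure LEFT
            for t in inherited:
                S[s].add(t if is_leaf(t) else child(t, 1))
            if s[-2:] == (0, 1):                     # s is special
                lc = child(s[:-2], 1)
                sup = lc if lc is None or is_leaf(lc) else child(lc, 0)
                if sup is None:
                    return False
                S[s].add(sup)
        else:                                        # Procedure RIGHT
            for t in inherited:
                nxt = t if is_leaf(t) else child(t, 0)
                if nxt is None:
                    return False
                S[s].add(nxt)
    # Step 4: every node collected at a leaf must itself be a leaf.
    return all(is_leaf(u) for t in nodes if is_leaf(t) for u in S[t])


MIN_T = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0),
         (0, 1, 1, 1, 0, 0), (0, 1, 1, 0, 1, 1)]
print(check_pcb(MIN_T))
print(check_pcb([(1, 1, 0), (0, 1, 1)]))
```

On the running example the set collected at the leaf for (011011) consists of the leaves for (101010) and (011100), one for each special node on its path.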
4 A Linear Time Algorithm for Problem RECOG($C_R$)

In this section, we describe a linear time algorithm for problem RECOG($C_R$).
The algorithm makes use of the fully condensed tree as a data structure for
$\min T(f)$. Let us start with redefining a special node and its support in FCB.
A node $s$ is said to be special if either

$s = left\text{-}child(parent(s))$, $parent(s) = right\text{-}child(grandparent(s))$,
and $L(parent(s)) = \emptyset$, or    (6)

$s = right\text{-}child(parent(s))$ and $L(s) \ne \emptyset$.    (7)

For a special node $s$, we now define its support $support(s)$. In order to
clarify the definition, we consider the following four cases separately.

Case (i). Let $s$ be a special node as defined by (6) such that
$L(s)_1 = L(s)_2$, where $L(s)_i$ denotes the $i$-th component of $L(s)$. Then a
node $t$ is called the support of $s$ if either

(i.a) $t = right\text{-}child(left\text{-}child(grandparent(s)))$,
$L(parent(t))_1 = L(parent(t))_2$, and $L(t) = \emptyset$, or

(i.b) $t$ is a leaf, $t = left\text{-}child(grandparent(s))$, and $L(t)_1 = L(t)_2$.

Case (ii). Let $s$ be a special node as defined by (6) such that
$L(s)_1 < L(s)_2$, where we write $x_i < x_j$ if $i < j$. Then a node $t$ is
called the support of $s$ if either

(ii.a) $t$ is a left-descendant of $right\text{-}child(left\text{-}child(grandparent(s)))$,
$L(left\text{-}child(grandparent(s)))_1 = L(left\text{-}child(grandparent(s)))_2$,
and $L(t)_2 = L(s)_2$, or

(ii.b) $t$ is a leaf, $t$ is a left-descendant of
$right\text{-}child(left\text{-}child(grandparent(s)))$,
$L(left\text{-}child(grandparent(s)))_1 = L(left\text{-}child(grandparent(s)))_2$,
and $L(t)_2 < L(s)_2$, or

(ii.c) $t$ is a leaf, $t = left\text{-}child(grandparent(s))$, and $L(t)_1 = L(t)_2$.

Case (iii). Let $s$ be a special node as defined by (7) such that
$L(s)_1 = L(s)_2$. Then a node $t$ is called the support of $s$ if either

(iii.a) $t = right\text{-}child(left\text{-}child(parent(s)))$,
$L(parent(t))_1 = L(parent(t))_2$, and $L(t) = \emptyset$, or

(iii.b) $t$ is a leaf, $t = left\text{-}child(parent(s))$, and $L(t)_1 = L(t)_2$.

Case (iv). Otherwise (i.e., $s$ is a special node as defined by (7) such that
$L(s)_1 < L(s)_2$), a node $t$ is called the support of $s$ if either

(iv.a) $t$ is a left-descendant of $right\text{-}child(left\text{-}child(parent(s)))$,
$L(left\text{-}child(parent(s)))_1 = L(left\text{-}child(parent(s)))_2$, and
$L(t)_2 = L(s)_2$, or

(iv.b) $t$ is a leaf, $t$ is a left-descendant of
$right\text{-}child(left\text{-}child(parent(s)))$,
$L(left\text{-}child(parent(s)))_1 = L(left\text{-}child(parent(s)))_2$, and
$L(t)_2 < L(s)_2$, or

(iv.c) $t$ is a leaf, $t = left\text{-}child(parent(s))$, and $L(t)_1 = L(t)_2$.

Note that the definitions of supports for cases (iii) and (iv) are,
respectively, obtained from those for cases (i) and (ii) by replacing all
occurrences of $grandparent(s)$ by $parent(s)$. Let $support(s)$ denote the
support of a special node $s$. For example, nodes $s_3$, $s_5$, $s_7$ and $s_9$ in
Figure 3 are all special, and their supports are $support(s_3) = s_5$,
$support(s_5) = s_4$, $support(s_7) = s_6$ and $support(s_9) = s_8$. We can see
that the definitions of a special node and its support in FCB correspond to those
in PCB.

Now we describe our algorithm. 

Algorithm CHECK-FCB 

Input: minT(f), where f is a positive function of n variables.

Output: If f is regular, "Yes"; otherwise "No".

Step 0. construct a fully condensed binary tree FCB(minT(f));

Step 1. if FCB(minT(f)) is not a complete binary tree then output "No" and halt
    /* i.e., there is an internal node s having exactly one child */
  else goto Step 2;
  end;

Step 2. for each node s do
    S(s) := ∅
  end{for};
  d := 1; /* Initialize depth d. */

Step 3. for each node s at depth d do begin
    if s = left-child(parent(s)) then
      call Procedure LEFT;
      if LEFT = 0 then output "No" and halt end
    else call Procedure RIGHT; /* i.e., s = right-child(parent(s)). */
      if RIGHT = 0 then output "No" and halt end
    end;
  end{for};
  d := d + 1;
  if there is no node at depth d then goto Step 4;
  else goto Step 3;
  end;

Step 4. for each leaf t do begin
    if S(t) contains an internal node then output "No" and halt end
  end{for};
  output "Yes" and halt.

Procedure LEFT
  for each node t in S(parent(s)) do begin
    if t is a leaf then S(s) := S(s) ∪ {t};
    else find a left-descendant t* of left-child(t) such that either L(t*)_2 = L(s)_2 or
      t* is a leaf and L(t*)_2 < L(s)_2.
      /* Since s and t* are left-children, L(s), L(t*) ≠ ∅. */
      if there is no such t* then LEFT := 0 and return; /* Regularity test fails. */
      else S(s) := S(s) ∪ {t*};
      end
    end
  end{for};
  if s is special then
    if support(s) = ∅ then LEFT := 0 and return; /* Regularity test fails. */
    else S(s) := S(s) ∪ {support(s)};
    end
  end;
  LEFT := 1 and return.

Procedure RIGHT 

  for each node t in S(parent(s)) do begin
    if t is a leaf then S(s) := S(s) ∪ {t};
    else find a left-descendant t* of right-child(t) such that either L(t*)_2 = L(s)_2 or
      t* is a leaf and L(t*)_2 < L(s)_2.

/* L{r) = 0 is regarded as L{r) — [0, 0] for a node r. 

Assume that ^ k, k and ^ ^ k for all integer k. */ 
      if there is no such t* then RIGHT := 0 and return; /* Regularity test fails. */
      else S(s) := S(s) ∪ {t*};
      end
    end
  end{for};
  if s is special then
    if support(s) = ∅ then RIGHT := 0 and return; /* Regularity test fails. */
    else S(s) := S(s) ∪ {support(s)};
    end
  end;
  RIGHT := 1 and return.
Theorem 5. Algorithm CHECK-FCB correctly checks if a given positive function $f$
is regular in $O(n|\min T(f)|)$ time.

Corollary 6. Problem RECOG($C_{2M}$) can be solved in $O(n(n + |\min T(f)|))$ time.
Abstract. In wireless communication, the signal of a typical broadcast
station is transmitted from a broadcast center p and reaches objects at
a distance, say, R from it. In addition there is a radius r, r < R, such 
that the signal originating from the center of the station is so strong 
that human habitation within distance r from the center p should be 
avoided. Thus every station determines a region which is an “annulus of 
permissible habitation”. We consider the following station layout (SL) 
problem: Cover a given (say, rectangular) planar region which includes 
a collection of orthogonal buildings with a minimum number of stations 
so that every point in the region is within the reach of a station, while 
at the same time no building is within the dangerous range of a station. 
We give algorithms for computing such station layouts in both the one- 
and two-dimensional cases. 



1 Introduction 

In wireless communication we are interested in providing access to communi- 
cation to a region (e.g. a city, a campus, etc) within which several sites (e.g. 
buildings) are located. Closeness to stations may be undesirable in certain in- 
stances, e.g. hospital or laboratory facilities, people with heart pace-makers, etc. 
Thus, although we are interested in providing communication access everywhere, 
part of the buildings may need to be away from strong electronic emissions of 
stations. 



A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 269-|27^ 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999
Cellular phones are radio receivers which operate in the ultra-high frequency
(UHF) band. They receive radio transmissions from a central base station (or 
cell) at frequencies between 869 and 894 MHz and retransmit their radio signal 
back to the base station at frequencies between 824 and 850 MHz. Stations emit 
signals whose strength is inversely proportional to the square of the distance 
from the station. It follows that the signal’s strength degrades as we move away 
from the center of the station. This determines a threshold (1 W is the currently
accepted value) beyond which the signal is sufficiently safe but still strong enough 
to reach its desirable destination. A comprehensive study and survey of the 
biological effects of exposure to radio frequency resulting from the use of mobile 
and other personal communication services can be found in jH]. 

In this paper we consider broadcast station layouts in wireless communication 
in which we take into account health hazards resulting from the closeness of 
human habitation to the transmission station. Given such constraints we are 
interested in minimizing the number of broadcast stations used. The buildings 
are located within a region $\mathcal{R}$, which for the sake of simplicity we assume to
be rectangular. In the most general case the buildings may be represented by 
simple polygons with or without holes. 



1.1 Formulation of the Problem and Notation 

The parameters involved in transmissions for a typical station in the plane are
the transmission center $p$ of the station, and positive real numbers $r < R$ such
that $R$ is the reachability range of the station, i.e., the signal transmitted
from the center $p$ can reach any destination at distance $R$ from the center, and
$r$ is the dangerous range of the station, i.e., the strength of the transmitted
signal exceeds permissible health constraints within distance $r$ from the center.
Let $d(\cdot,\cdot)$ be the given distance function. The disc
$D(p; r) = \{x : d(x,p) \le r\}$ is the locus of points that are "too close" to
the broadcast center $p$. Existing health constraints make it advisable that human
habitation is not allowed within the disc $D(p; r)$. At the same time the signal
reception does not cause a health hazard beyond distance $r$ from the broadcast
center of the station; moreover the signal can reach any location at distance at
most $R$ from the center. This determines an annulus
$A(p; r, R) = D(p; R) \setminus D(p; r)$. Thus $A(p; r, R)$ is the annulus formed
by two squares centered at $p$ with diameters $2r$ and $2R$, respectively.
Throughout this paper we assume that $d$ is the $L_1$ or Manhattan metric.

The numbers $r$, $R$ represent the parameters suggested by the manufacturer.
In addition, we want to produce a layout of transmitting stations in such a way
that all points of the region $\mathcal{R}$ are within range $R$ of a transmitting
station while at the same time no site is within distance $r$ from any
transmitting station. More specifically, we have the following definition.

Definition 1. A collection of $m$ points $A = \{a_1, a_2, \ldots, a_m\}$ is called
an $(r, R)$-cover for $(\mathcal{R}, \mathcal{P})$ if the collection
$\{D(a_i; R) : i = 1, 2, \ldots, m\}$ of discs covers the rectangular region
$\mathcal{R}$, but none of the discs $D(a_i; r)$, $i = 1, \ldots, m$, has a point
interior to any building in $\mathcal{P}$. If $r = 0$ then a $(0, R)$-cover is
also called an $R$-cover.
Problem 1 (General problem).

Input: A rectangular region $\mathcal{R}$ and a collection $\mathcal{P}$ of simple
polygons (or buildings) inside the region and two real numbers $0 \le r < R$.

Output: An $(r, R)$-cover $A = \{a_1, a_2, \ldots, a_m\}$ for
$(\mathcal{R}, \mathcal{P})$ or a report that no such cover exists.

The important parameter to be optimized is the number $m$ of transmitting
stations. In general we are interested in an algorithm that will report an optimal
or even near-optimal number of stations. A cover is said to be optimal iff it uses
a minimum number of stations. If $\mathcal{P}$ is the collection of buildings then
we use the notation $A(\mathcal{P}; r, R)$ to abbreviate the square annulus
version of the problem. We stipulate that every point in the region $\mathcal{R}$
must be within the reach of a station. At the same time, although a point lying in
a building cannot be within the dangerous zone of a station, this is not a priori
prohibited if the point does not lie inside a building. In addition, it is
permissible that a point (in a building) may lie within the range of more than one
station. An important observation that will be used in the sequel is that
Problem 1 is computationally equivalent to the following problem:

Problem 2 (Reduced general problem).

Input: A rectangular region $\mathcal{R}$ and a collection $\mathcal{P}$ of simple
polygons (or buildings) inside the region and a real number $R > 0$.

Output: An $R$-cover $A = \{a_1, a_2, \ldots, a_m\}$ for
$(\mathcal{R}, \mathcal{P})$ or a report that no such cover exists.

Clearly, Problem 1 is more general than Problem 2. To prove the reverse reduction
we surround each building with a strip of width $r$ and merge the resulting
orthogonal polygons into the new buildings. Details of the proof of this are left
to the reader.

1.2 Results of the Paper 

In the sequel we consider first the one-dimensional case of the problem. In the 
two-dimensional case, first we consider an algorithm for testing the existence of a 
solution. Subsequently, we show how to reduce the problem to a discrete problem 
in which the centers of the stations are to be located at predetermined points 
within the region $\mathcal{R}$. This is used later on to provide (1) a linear time, logarithmic
approximation algorithm by reduction to SET-COVER, (2) a polynomial time, 
constant approximation algorithm, and (3) for “thin” buildings, a linear time 
constant approximation algorithm. 



1.3 On the Number of Stations

We observe that the size of an $(r, R)$-cover for $(\mathcal{R}, \mathcal{P})$,
i.e., the number of points needed to cover a rectangular region $\mathcal{R}$, is
not only proportional to $Area(\mathcal{R})/R^2$ but also to the number of
vertices of the given polygons $\mathcal{P}$ bounded by the rectangular region.






P. Bose et al. 



Theorem 1. There is a rectangular region $\mathcal{R}$ of area $O(R^2)$ with a
single polygon in $\mathcal{P}$, such that any $(r, R)$-cover for it must be of
size $\Omega(n)$ where $n$ is the number of vertices in the given polygon in
$\mathcal{P}$.

As a consequence, the complexities of the given algorithms are best expressed as a
function of the input size of the problem. This is defined to be
$N = n + Area(\mathcal{R})/R^2$, where $n$ is the total number of vertices of the
polygonal buildings and $Area(\mathcal{R})$ is the area of the given region
$\mathcal{R}$.

2 Algorithm on the Line 

This section considers the one-dimensional analogue of the station layout problem,
problem 1-SL. In this case, the transmitting station is modeled by the
one-dimensional analogue of the annulus, i.e., the set $I(p; r, R)$ of points $x$
on a line such that $r < |x - p| \le R$. The region is a line segment $I_0$, and
the set of buildings is $\mathcal{I} = \{I_1, \ldots, I_n\}$, where each building
$I_j$, $1 \le j \le n$, is an interval $I_j = [p_j, q_j]$ in the line segment
$I_0$. In this version, an $(r, R)$-cover for the instance at hand is a collection
of $m$ points $A = \{a_1, a_2, \ldots, a_m\}$ none of which is at a distance less
than $r$ from any interval in $\mathcal{I}$, such that the collection of intervals
$I(a_i; 0, R)$, $i = 1, \ldots, m$, covers the segment $I_0$.

Problem 3 (One-dimensional problem).

Input: A line segment $I_0$ and a collection of (possibly overlapping) intervals
$\mathcal{I} = \{I_1, \ldots, I_n\}$ inside the segment, and two real numbers
$0 \le r < R$.

Output: An $(r, R)$-cover $A = \{a_1, a_2, \ldots, a_m\}$ for
$(I_0, \mathcal{I})$ or a report that no such cover exists.



Theorem 2. There exists an $O(N \log N)$ time algorithm for computing a minimum
size $R$-cover for an instance $\mathcal{I}$ of the 1-SL problem.
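The proof of Theorem 2 is not given here, so the following Python sketch should be read only as one natural way to attack the one-dimensional problem, not as the authors' algorithm. It is a left-to-right greedy: repeatedly place a station as far right as possible while still covering the leftmost uncovered point. The assumptions are ours: centers may be placed anywhere on the line (also outside the segment), a position is forbidden when its distance to some building is strictly less than $r$, the segment has positive length, and the function name `one_dim_cover` is hypothetical. The inner scan is linear, so this version does not attain the $O(N \log N)$ bound without a binary search over the merged forbidden intervals.

```python
def one_dim_cover(segment, buildings, r, R):
    """Greedy (r, R)-cover of segment = (a, b); returns centers or None."""
    a, b = segment
    # Forbidden centers form open intervals (p - r, q + r); merge overlapping ones.
    forbidden = []
    for lo, hi in sorted((p - r, q + r) for p, q in buildings):
        if forbidden and lo < forbidden[-1][1]:
            forbidden[-1][1] = max(forbidden[-1][1], hi)
        else:
            forbidden.append([lo, hi])

    def rightmost_allowed(x):
        """Largest allowed position that is <= x."""
        for lo, hi in forbidden:
            if lo < x < hi:
                return lo
        return x

    stations, covered = [], a
    while covered < b:
        p = rightmost_allowed(covered + R)
        if p + R <= covered:
            return None          # no allowed center makes progress past `covered`
        stations.append(p)
        covered = p + R
    return stations


print(one_dim_cover((0, 10), [(4, 6)], 1, 2))
print(one_dim_cover((0, 10), [(0, 10)], 0, 1))
```

In the first call the building $[4, 6]$ with $r = 1$ forbids centers in the open interval $(3, 7)$, so the greedy uses the boundary positions 3 and 7; in the second call the building is wider than $2R$ and no cover exists.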



3 Algorithms on the Plane 

In this section we study the case of orthogonal buildings and stations which are 
square annuli. 

3.1 Testing for a Solution 

Recall our previous reduction of Problem 1 to Problem 2. Hence without loss of
generality, we may assume $r = 0$.

Theorem 3. A solution exists if and only if there is no point p interior to a 
polygon (i.e., building) such that D(p;R) lies entirely inside the interior of the 
polygon. In particular, there exists an 0(max{7V, n^}) time algorithm to deter- 
mine whether or not a solution exists. 
3.2 Reduction to a Discrete Problem Without Buildings

Now we reduce Problem 2 to a discrete problem without any buildings (see
Problem 4). First we define a collection $L$ of points inside the rectangular
region as the union of two sets $L_0$ and $L_1$, to be defined below.

1. To obtain the points in $L_0$ we partition the rectangular region $\mathcal{R}$
into vertical and horizontal strips at distance $R$ apart and let $L_0$ be the
collection of points of intersection of these lines which lie outside any
building in $\mathcal{P}$.

2. The points in $L_1$ lie on the perimeter of buildings in $\mathcal{P}$. These
points are of two types: (a) all vertices of these polygons, (b) for any
polygon in $\mathcal{P}$, and starting from an arbitrary vertex of the polygon,
walk along the perimeter and place points on the perimeter at distance $R$ apart.
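To make the construction of $L = L_0 \cup L_1$ concrete, here is a small Python sketch for the special case where every building is an axis-parallel rectangle. The restriction to rectangles, the choice of the lower-left corner of the region as the grid origin, the reading of "outside any building" as "not strictly inside a building", and the names `grid_points`, `perimeter_points` and `candidate_points` are all assumptions made for illustration.

```python
def grid_points(region, buildings, R):
    """L_0: grid points at spacing R that are not strictly inside a building."""
    x0, y0, x1, y1 = region
    pts = []
    i = 0
    while x0 + i * R <= x1:
        j = 0
        while y0 + j * R <= y1:
            px, py = x0 + i * R, y0 + j * R
            inside = any(bx0 < px < bx1 and by0 < py < by1
                         for bx0, by0, bx1, by1 in buildings)
            if not inside:
                pts.append((px, py))
            j += 1
        i += 1
    return pts


def perimeter_points(building, R):
    """L_1 for one rectangle: its vertices plus points every R along the perimeter."""
    bx0, by0, bx1, by1 = building
    w, h = bx1 - bx0, by1 - by0
    pts = {(bx0, by0), (bx1, by0), (bx1, by1), (bx0, by1)}
    s = 0
    while s < 2 * (w + h):
        # Walk counter-clockwise starting at the lower-left vertex.
        if s <= w:
            pts.add((bx0 + s, by0))
        elif s <= w + h:
            pts.add((bx1, by0 + (s - w)))
        elif s <= 2 * w + h:
            pts.add((bx1 - (s - w - h), by1))
        else:
            pts.add((bx0, by1 - (s - 2 * w - h)))
        s += R
    return pts


def candidate_points(region, buildings, R):
    """The candidate set L = L_0 united with L_1."""
    pts = set(grid_points(region, buildings, R))
    for b in buildings:
        pts |= perimeter_points(b, R)
    return pts


L = candidate_points((0, 0, 4, 4), [(1, 1, 3, 3)], 2)
print(sorted(L))
```

For this instance the grid contributes eight points (the center of the region lies inside the building and is dropped) and the building contributes its four vertices.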

We refer to squares whose centers are points in $L$ as $L$-squares. A discrete
$(r, R)$-cover for the region $\mathcal{R}$ is a cover by $L$-squares. The basic
lemma is the following.

Lemma 1. An $(r, R)$-cover to the square version of Problem 1 exists if and only
if a discrete $(r, R)$-cover exists. Moreover, the size of an optimal discrete
cover is at most four times that of an optimal cover and the $(r, R)$-cover can be
constructed in time $O(N)$.

The previous lemma reduces Problem 1 to the following one.

Problem 4 (General discrete problem).

Input: A rectangular region $\mathcal{R}$, a set $L$ of points inside the region,
and a positive number $R$.

Output: A subset $S$ of minimal size of the set $L$ such that the set of squares
of radius $R$ centered at points of $S$ covers the entire rectangular region.

Conversely, it is easy to see that Problem 4 can be reduced to Problem 1. To see
this consider an instance of Problem 4. For each point $p \in L$ place a square
$D(p; R) \setminus \{p\}$. Append these squares as part of the input set of
polygons. It is clear that in the resulting instance of Problem 1 stations can
only be placed at points $p \in L$.

3.3 Logarithmic Approximation Algorithm 

In the sequel we give an $O(\log N)$-approximation algorithm for Problem 4 by
reducing it to the well-known problem SET-COVER, where $N$ is the size of the
input. Consider an input as in Problem 4. For each point $p \in L$ consider the
square with radius $R$ centered at $p$. The collection of these squares forms a
planar subdivision of the rectangular region $\mathcal{R}$. Consider the bipartite
graph $(A, L)$ such that $A$ is the set of planar rectangular subdomains thus
formed. Moreover, for $a \in A$ and $p \in L$, $\{a, p\}$ is an edge if and only
if the subdomain $a$ lies entirely inside the square of radius $R$ centered at
$p$. Now observe that any solution of SET-COVER for the graph $(A, L)$ corresponds
to a solution of Problem 4 and vice versa. In view of the fact that there are
$O(\log N)$ approximation algorithms for SET-COVER (e.g., the greedy algorithm) we
obtain the following theorem.
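The reduction can be sketched in Python as follows. This is an illustration under assumptions of ours, not the authors' implementation: squares are taken to be axis-parallel with half-side $R$, the subdivision is refined to the grid cells induced by all square sides (clipped to the region), and the standard greedy heuristic for SET-COVER is applied to these cells. The function name `greedy_discrete_cover` is hypothetical, and the quadratic cell enumeration makes no attempt to reach the running time claimed in the text.

```python
def greedy_discrete_cover(region, points, R):
    """Greedy SET-COVER over grid cells; returns chosen centers or None."""
    x0, y0, x1, y1 = region
    if not points:
        return None

    def clip(v, lo, hi):
        return max(lo, min(hi, v))

    # Cell boundaries: region sides plus all square sides, clipped to the region.
    xs = sorted({x0, x1} | {clip(px + s, x0, x1) for px, _ in points for s in (-R, R)})
    ys = sorted({y0, y1} | {clip(py + s, y0, y1) for _, py in points for s in (-R, R)})
    cells = [(xs[i], xs[i + 1], ys[j], ys[j + 1])
             for i in range(len(xs) - 1) for j in range(len(ys) - 1)]
    # covers[k] = cells lying entirely inside the square centered at points[k].
    covers = [{c for c in cells
               if px - R <= c[0] and c[1] <= px + R and py - R <= c[2] and c[3] <= py + R}
              for px, py in points]
    uncovered, chosen = set(cells), []
    while uncovered:
        best = max(range(len(points)), key=lambda k: len(covers[k] & uncovered))
        if not covers[best] & uncovered:
            return None          # some cell is not covered by any square
        chosen.append(points[best])
        uncovered -= covers[best]
    return chosen


pts = [(1, 1), (3, 1), (1, 3), (3, 3), (2, 2)]
print(greedy_discrete_cover((0, 0, 4, 4), pts, 1))
```

On this instance the four corner squares are needed and the central square is never selected; ties are broken in favor of the point that comes first in the input list.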
Theorem 4. There is a linear time, logarithmic approximation algorithm for
Problem 4.



3.4 Constant Approximation Algorithm 

In this subsection we provide a polynomial time constant approximation algorithm
for solving Problem 4. From now on and for the rest of the paper we assume that
the radius of the stations is $R = 1$. Our solution is via a reduction to the
following problem.

Problem 5 (Discrete rectangle problem). 

Input: A rectangle $R$ with both height and width of length $\le 1$, and a
collection $Z = \{p_1, p_2, \ldots, p_n\}$ of $n$ points not necessarily all
inside the rectangle.
Output: The minimum number of unit squares with centers lying at the given
points whose union covers the rectangle $R$.

In particular, we will prove the following theorem. 

Theorem 5. There is a polynomial time, constant approximation algorithm for
Problem 5.

Before proving this theorem we indicate how it can be used to find a solution
to the General discrete problem, i.e., Problem 4. We can prove the following
theorem.

Theorem 6. There is a polynomial time constant approximation algorithm for
Problem 4, where $N$ is the size of the input. The constant is at most four.

Outline of the Proof of Theorem 5

We divide up the description of the proof into a classification of stations depen- 
ding on how the stations cover the rectangle. The resulting algorithm is recursive 
and is based on dynamic programming. The idea is as follows. We consider the 
“stations” centered at the given points. For a given rectangle R we consider all 
possible coverings of this rectangle by stations. We classify the square stations 
according to how they cover R, e.g. a square station may either cover R com- 
pletely, or the left-, right-, down-, up-side of R, or the left-down-, left-up-side, 
etc. It follows that the number of stations in an optimal solution is determined 
from solutions to other subrectangles. By scanning the solutions we can select 
the optimal solution to the rectangle R. 



Classification of the Min Size Cover 

Let $R := R[x, x', y, y']$ be the axis-parallel rectangle with lower left corner
$(x, y)$ and upper right corner $(x', y')$, height $V_R = y' - y$ and width
$H_R = x' - x$. Let the given points be $Z = \{p_1, p_2, \ldots, p_n\}$ and let
$R_i$ denote the unit square centered at $p_i$. We use the notation
$R_i = R[x_i^L, x_i^R, y_i^D, y_i^U]$ for the station with lower left corner
$(x_i^L, y_i^D)$ and upper right corner $(x_i^R, y_i^U)$. We want to find the
minimum size subset $P \subseteq Z$ such that the collection
$\mathcal{R}(P) = \{R_i : p_i \in P\}$ of squares covers the rectangle $R$.

We now define the $R$-Classification of stations. Given a rectangle
$R := R[x, x', y, y']$ we classify the squares in
$\mathcal{R}(\{p_1, p_2, \ldots, p_n\})$ as follows using the notation
C, L, R, U, D for Contains, Left, Right, Up, and Down, respectively. We also use
the notation LD for Left-Down (and similarly LU, RD, RU), i.e.,

$C = \{R_i : R_i \text{ contains } R\}$

$L = \{R_i : y_i^U \ge y',\ y_i^D \le y,\ x < x_i^R < x',\ x_i^L \le x\}$

$LD = \{R_i : y < y_i^U < y',\ y_i^D \le y,\ x_i^L \le x,\ x < x_i^R < x'\}$

The other classes R, U, D and LU, RD, RU are defined similarly. The sets L and
LD are depicted in Figure 1. Note that these sets are disjoint and their union is
equal to $\mathcal{R}(Z)$.

Fig. 1. $R$-Classification: in the left picture $R_i \in L$ and in the right picture $R_i \in LD$.



Given $\mathcal{R}$ and $R$ let $C^*(\mathcal{R}, R)$ denote the minimum size cover
of $R$ by stations from $\mathcal{R}$. We consider the following cases. Each case
assumes that the previous case does not hold. With this in mind it is clear that
the classification is complete, in the sense that $C^*(\mathcal{R}, R)$ must
belong to one of the cases below.

Case 1. $C \ne \emptyset$. In this case $|C^*(\mathcal{R}, R)| = 1$.

Case 2L. $L \ne \emptyset$. In this case $C^*(\mathcal{R}, R)$ contains exactly
one $R_i \in L$ (namely the one farthest to the right, which dominates all the
other rectangles in L) (see Figure 2).

Cases 2R, 2U, 2D. Similar.

Next we consider the classes LD, LU, RD, RU. We study only Case 3LD. The
other three cases are similar.

Case 3LD. $C^*(\mathcal{R}, R)$ contains at least two rectangles from LD.

Let the squares of $C^*(\mathcal{R}, R) \cap LD$ be
$R_{i_1}, R_{i_2}, \ldots, R_{i_k}$, ordered by ascending $x$-coordinate. Without
loss of generality we may assume that they are also ordered by descending
$y$-coordinate. Indeed, otherwise one of them, say $R_{i_j}$, is dominated by the
following square $R_{i_{j+1}}$ in terms of its contribution to covering $R$, and
hence it can be discarded. Let $R_{i_1}$ and $R_{i_2}$ be the two rectangles of LD
in $C^*(\mathcal{R}, R)$, and let $p$ be the upper right intersection point
between $R_{i_1}$ and $R_{i_2}$, $p = (x, y) = (x_{i_1}^R, y_{i_2}^U)$ (see
Figure 2).

We note that the same observation as for Case 3LD holds also for Cases
3LU, 3RD, 3RU. The last case left is the following.
Fig. 2. Leftmost figure depicts Case 2L; middle figure depicts Case 3LD, and rightmost
figure depicts Case 4.

Fig. 3. Classification of $C^*(\mathcal{R}, R)$.



Case 4. $C^*(\mathcal{R}, R)$ contains exactly one rectangle from each of LU, LD, RU, RD.



Dynamic Programming Algorithm 

We are now in a position to use the above $R$-Classification of squares in order
to provide a dynamic programming algorithm computing the minimal number
of squares in a covering. An optimal solution is constructed by recursion. The
purpose of the previous classification is to establish the fact that all possible
cases for the structure of $C^*$ are examined by the algorithm, and no possibility
is omitted. Define the sets

X = {x(; < xf, xf < : 1 < i < n} U {x^, x^} , 

Y = {yl^ < y^,vY < x^ : 1 < i < n} U {j/?,x^}, 

where $x_0, x_0', y_0, y_0'$ are the coordinates of the original rectangle. For
any $x, x' \in X$ and $y, y' \in Y$, let $T(x, x', y, y')$ be the size of the
minimum cover of the rectangle $R[x, x', y, y']$ by squares in $\mathcal{R}(Z)$.
The procedure is the following.

Procedure:

Calculate $T(x, x', y, y')$ for every $x, x' \in X$ and $y, y' \in Y$ in order of
increasing size, i.e., calculating $T(x, x', y, y')$ only after finishing all
rectangles $T(a, a', b, b')$ with both $|a - a'| \le |x - x'|$ and
$|b - b'| \le |y - y'|$. In order to calculate $T(x, x', y, y')$ for
$R = R[x, x', y, y']$ and $\mathcal{R}$, check systematically through all
possibilities for $C^*(\mathcal{R}, R)$.
Fig. 4. $R$-Classification of $C^*(\mathcal{R}, R)$.



Case 1. If we are in Case 1, then there should be some Ri that contains R. This 
is checkable in time 0(n). 

Case 2L. In this case suppose $L \ne \emptyset$. Go through each $R_i \in L$. For each of
these, consult the table concerning the value $t_i = T(x_i^R, x', y, y')$, which is the
minimum coverage for $R[x_i^R, x', y, y']$. If such an $R_i$ exists then return $t_i + 1$. Of
course it suffices to take the “most dominant” $R_i \in L$, i.e., the one with greatest
$x_i^R$.

Cases 2R, 2U, 2D are similar, while Case 4 is easy. 

Case 3LD. From the observation we know that in this case we have $R_k$ as in the
rightmost picture of the figure. Cycle through all choices of $R_{i_1}, R_{i_2} \in$ LD
and $R_k \in$ RU. If they are not in the “right shape”, ignore them. Otherwise set

$$t' \leftarrow T(R'), \qquad t'' \leftarrow T(R''), \qquad \mathrm{Reply}(i_1, i_2, k) \leftarrow t' + t'' + 3.$$

Choose the best of the $O(n^3)$ replies $\mathrm{Reply}(i_1, i_2, k)$.

Cases 3LU, 3RD, 3RU are similar, while Case 4 is easy. Combining all these 
cases we obtain the general procedure for computing T{x,x' ,y,y') by selecting 
the best of all replies. This completes the outline of the proof of Theorem El ■ 

3.5 Conditional, Constant Approximation Algorithm 

In this section we provide a linear time, constant approximation algorithm when
the buildings satisfy certain width constraints.

Theorem 7. If there is no point $p$ interior to a polygon such that $D(p; R/2)$
lies entirely inside the interior of the polygon then there is a linear time
approximation algorithm for covering the region $\mathcal{R}$ whose number of squares is at most
four times the optimal.

We can improve on the constant four above as follows. The horizontal h 
(respectively, vertical v) width of an orthogonal polygon is the maximum length 
horizontal (respectively, vertical) line segment that lies inside the polygon. 






Theorem 8. If either h < 2R or v < 2R then there is a linear time algorithm 
for finding a solution to Problem[M such that the number of stations is at most 
two times the optimal. 



3.6 Conclusion 

It is an open problem whether or not finding an optimal solution to our problem
can be done in polynomial time. Another interesting open problem arises when
we consider an upper bound on the number of stations permitted to cover a 
given point in the region. (As Theorem [T] indicates, such a coverage may not 
always exist.) 

We note that the results of the paper are stated only for the Manhattan or $L_1$
metric. Similar algorithms and results are possible for the more realistic “hexagonal”
metric. The only modification necessary is that the resulting constants in the
approximation algorithms are now derived using stations with hexagonal transmission
range. Details will appear in the final version of the paper.



References 

1. L. J. Aupperle, H. E. Conn, J. M. Keil, and J. O’Rourke, “Covering Orthogonal
Polygons with Squares”, in Proceedings of the 26th Annual Allerton Conference on
Comm., Contr., and Comp., Urbana, 28-30 Sep. 1988.

2. R. Bar-Yehuda and E. Ben-Chanoch, “An O(N log* N) Time Algorithm for Covering
Simple Polygons with Squares”, in Proceedings of the 2nd Canadian Conference
on Computational Geometry, Ottawa, pp. 186-190, 1990.

3. B. N. Clark, C. J. Colbourn, and D. S. Johnson, “Unit Disk Graphs”, Discrete
Mathematics 86 (1990) 165-177.

4. T. H. Cormen, C. E. Leiserson, and R. L. Rivest, “An Introduction to Algorithms”,
MIT Press, 1990.

5. N. A. DePano, Y. Ko, and J. O’Rourke, “Finding Largest Equilateral Triangles
and Squares”, Proceedings of the Allerton Conference, pp. 869-878, 1987.

6. F. Gavril, “Algorithms for Minimum Coloring, Maximum Clique, Minimum Covering
by Cliques, and Maximum Independent Set of a Chordal Graph”, SIAM J.
Comput., Vol. 1, No. 2, June 1972, pp. 180-187.

7. D. S. Hochbaum and W. Maass, “Approximation Schemes for Covering and
Packing Problems in Image Processing and VLSI”, J. ACM 32:130-138, 1985.

8. H. B. Hunt, M. V. Marathe, V. Radhakrishnan, S. S. Ravi, D. J. Rosenkrantz, R.
E. Stearns, “NC Approximation Schemes for NP- and PSPACE-Hard Problems for
Geometric Graphs”, in Proceedings of 2nd ESA, pp. 468-477.

9. J. C. Lin, “Biological Aspects of Mobile Communication Data”, Wireless Networks,
Vol. 3 (1997) No. 6, pp. 439-453.

10. D. Moitra, “Finding a Minimal Cover for Binary Images: An Optimal Parallel
Algorithm”, Algorithmica (1991) 6: 624-657.

11. K. Pahlavan and A. Levesque, “Wireless Information Networks”, Wiley-Interscience,
New York, 1995.



Reverse Center Location Problem*



Jianzhong Zhang^1, Xiaoguang Yang^2, and Mao-cheng Cai^3,**

^1 Department of Mathematics, City University of Hong Kong, Hong Kong
MAZHANG@cityu.edu.hk

^2 Institute of Systems Science, Academia Sinica, Beijing, China
xgyang@iss04.iss.ac.cn

^3 Institute of Systems Science, Academia Sinica, Beijing, China
caimc@bamboo.iss.ac.cn



Abstract. In this paper we consider a reverse center location problem
in which we wish to spend as little as possible to ensure that the distances
from a given vertex to all other vertices in a network are within
given upper bounds. We first show that this problem is NP-hard. We
then formulate the problem as a mixed integer programming problem
and propose a heuristic method to solve this problem approximately on
a spanning tree. A strongly polynomial method is proposed to solve the
reverse center location problem on this spanning tree.

Keywords: networks and graphs, NP-hard, satisfiability problem, relaxation,
maximum cost circulation.

AMS subject classification. 68Q25, 90C27. 



1 Introduction 

The center location problem, which is to find the “best” position of a facility in a 
network to minimize the distance from the facility to the farthest vertices of the 
network, is a very practical OR problem which has attracted much attention. 
The problem is well-solved, see, for example, [H]. A slight generalization of the 
center location problem is to find a vertex whose distances to other vertices do 
not exceed given upper bounds. This can be done by computing the distance 
matrix using the Floyd-Warshall algorithm, and comparing the entries of the 
matrix with the given upper bounds. 
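
The check described above can be sketched in a few lines. This is only an illustration, not code from the paper: the function names, the 0-based vertex numbering and the convention that `bound[v]` is the upper bound attached to vertex `v` are all assumptions made for the example.

```python
import math

def floyd_warshall(n, edges):
    """All-pairs shortest distances of an undirected graph on vertices 0..n-1."""
    d = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        d[v][v] = 0.0
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def generalized_centers(n, edges, bound):
    """Vertices s whose distance to every other vertex v is at most bound[v]."""
    d = floyd_warshall(n, edges)
    return [s for s in range(n)
            if all(d[s][v] <= bound[v] for v in range(n) if v != s)]
```

On the path 0 - 1 - 2 with unit weights and all bounds equal to 1, only the middle vertex qualifies.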

In this paper, we consider the reverse problem of the generalized center lo- 
cation problem, that is, to modify the weights of a network optimally and with 
bound constraints, such that the distances between a given vertex and other 
vertices are bounded by given limits. Note that here a vertex is first selected and 
then we need to adjust the lengths to make the vertex become the solution of a 
generalized center location problem. That is why we call it a reverse problem. 

* The authors gratefully acknowledge the partial support of the Hong Kong Universi-
ties Grant Council (CERG CITYU-9040189).

** Research partially supported by the National Natural Science Foundation of China.



A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 279-|29^ 1999. 
Obviously, to solve this problem, we only need to decrease, but never increase,
some weights. 

We can describe the reverse center location problem (RCL) formally as follows.
Let $G = (V, E, w, l, b, c)$ be a weighted undirected (directed) connected
graph, where $V$ is a vertex set, $E$ is an edge (arc) set, and $w : E \to R_+$ is a weight
or distance vector. Let $s$ be a given vertex in $V$, and let the vector $l : V \setminus \{s\} \to R_+$
give the upper bounds for the distances from $s$ to the other vertices. Let $b : E \to R_+$ be a
bound vector for the adjustment of weights which satisfies $b(e) \le w(e)$, $\forall e \in E$.
Let $c : E \to R_+$ be the cost vector incurred by decreasing $w$ by one unit. The
reverse center location problem is to decrease $w$ to $w^*$ such that

(a) $d_{w^*}(s, v) \le l(v)$ holds under $w^*$ for all $v \in V \setminus \{s\}$.

(b) $w(e) - b(e) \le w^*(e)$ for each $e \in E$.

(c) The total cost $\sum_{e \in E} c(e)(w(e) - w^*(e))$ is minimum.

Note that in this paper we use $d_w(u, v)$ to represent the distance from vertex
$u$ to vertex $v$ under weight $w$, which is equal to the length of the shortest path
between $u$ and $v$ under weight $w$.

It is straightforward to see that a given instance is feasible if and only if
$d_{w-b}(s, v) \le l(v)$ holds for all $v \in V \setminus \{s\}$.
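
The feasibility criterion can be checked with one shortest path computation under the fully decreased weights $w - b$. The sketch below is illustrative only; the edge tuple layout `(u, v, w, b)`, the function names and the use of Dijkstra's algorithm are assumptions for the example, not part of the paper.

```python
import heapq
import math

def dijkstra(adj, s):
    """Shortest distances from s; adj maps a vertex to a list of (neighbor, weight)."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0.0
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def rcl_is_feasible(vertices, edges, s, l):
    """edges: (u, v, w, b) for an undirected graph with 0 <= b <= w; l: upper bounds."""
    adj = {v: [] for v in vertices}
    for u, v, w, b in edges:
        adj[u].append((v, w - b))  # every weight decreased as far as allowed
        adj[v].append((u, w - b))
    dist = dijkstra(adj, s)
    return all(dist[v] <= l[v] for v in vertices if v != s)
```

Since $w - b \ge 0$, Dijkstra's algorithm applies; any other shortest path routine would serve equally well.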

A closely related work was reported by Burton, Pulleyblank & Toint [1]. They
consider the following problem: for $m$ given pairs of vertices $(s_i, t_i)$ and upper bounds
$u_i$, modify $w$ to $w^*$ such that $d_{w^*}(s_i, t_i) \le u_i$ under $w^*$ and
$\sum_{e \in E} (w(e) - w^*(e))^2$
is minimum. They proved that the problem is NP-hard, and gave an algorithm
to find a locally optimal solution.

Let $s_i = s$ and $\{t_i \mid i = 1, 2, \ldots, m\} = V \setminus \{s\}$; then our reverse center location
problem can be regarded as a special case of theirs as far as the distance requirement
is concerned. But to make the reverse problem more reasonable, we use a
weighted linear objective function to replace their $l_2$ measure so that the cost
coefficients on various edges can be different. Also, we impose restrictions on the
magnitude of the adjustments. What we pursue in this paper is an approximate
globally optimal solution. Consequently, the method which we developed in this
paper is independent of theirs. The paper is organized as follows. In Section 2,
we show that the reverse center location problem is NP-hard. Then the
problem is formulated as a mixed integer programming problem in Section 3. In
Section 4 we relax the integer requirement and then use a heuristic method to
determine a spanning tree such that the problem can be treated approximately
on the tree. In Section 5 we solve the dual problem of the approximate RCL problem
as a maximum cost circulation problem, and then in Section 6 we recover
the primal approximate optimal solution from the dual optimal solution. Finally
we summarize our method and give conclusions in Section 7.

2 Complexity Analysis 

We now show that the reverse center location problem is NP-hard. The idea
was motivated by [1], but our proof is more elegant.
The SATISFIABILITY problem [T^ can be stated as follows: Given $m$ clauses
$\{C_1, C_2, \ldots, C_m\}$ involving $n$ Boolean variables $\{x_1, x_2, \ldots, x_n\}$, does there exist
a set of values of the variables (called a truth assignment) such that all clauses
are true?

If such a set of values exists, we say the SATISFIABILITY problem is sa- 
tisfiable. The SATISFIABILITY problem is the earliest natural NP-complete 
problem proven by Cook [S]. The technique which we use to show the NP- 
hardness of the reverse center location problem is a polynomial time reduction 
of the SATISFIABILITY problem into an instance of decision problem of the 
RCL problem. 

Theorem 1 Even if $c(e) = \mathrm{const}$, $l(v) = \mathrm{const}$ and $b(e) = w(e)$, the reverse
center location problem is NP-hard.

Remark. This is the simplest case of the RCL model, and by this theorem, for
general $c$, $l$, $b$, the reverse center location problem is of course NP-hard.

Proof. Given an instance of the RCL problem $G = (V, E, w, l, b, c)$ and a number
$L$, the decision problem of the RCL problem is whether there is a solution $w^*$
satisfying the conditions (a), (b) and

(d) $\sum_{e \in E} c(e)(w(e) - w^*(e)) \le L$.

We call this $w^*$ a feasible solution of the RCL problem within the budget limit
$L$.

Let us now construct a reduction from the SATISFIABILITY problem to an 
instance of decision problem of the RCL problem. 

Consider a SATISFIABILITY problem with $n$ variables $\{x_1, x_2, \ldots, x_n\}$ and
$m$ clauses $\{C_1, C_2, \ldots, C_m\}$. Without loss of generality, we assume that, for each
variable $x_i$ and each clause $C_j$, $x_i$ does not appear in $C_j$ more than once, and
$x_i$ and $\bar{x}_i$ do not appear in $C_j$ at the same time.

Construct a graph $G = (V, E)$ as follows:

$$V = \bigcup_{i=1}^{n} \{t_i, u_i, v_i\} \cup \{t_{n+1}, t_{n+2}, t_{n+3}\} \cup \{q_j \mid j = 1, \ldots, m\},$$

$$E = \bigcup_{i=1}^{n} \{(t_i, u_i), (t_i, v_i), (u_i, t_{i+1}), (v_i, t_{i+1})\} \cup \{(t_{n+1}, t_{n+2}), (t_{n+1}, t_{n+3})\}
\cup \bigcup_{i=1}^{n} \bigcup_{j=1}^{m} \big(\{(u_i, q_j) \mid x_i \in C_j\} \cup \{(v_i, q_j) \mid \bar{x}_i \in C_j\}\big).$$

In the graph, Boolean variable $x_i$ corresponds to vertex $u_i$, and its negation $\bar{x}_i$
corresponds to vertex $v_i$. Clause $C_j$ corresponds to vertex $q_j$. Further, there is
an edge $(u_i, q_j)$ if and only if $x_i \in C_j$, and $(v_i, q_j)$ exists if and only if $\bar{x}_i \in C_j$.
We call $u_i$ and $v_i$ literal vertices, $q_j$ clause vertices, the edges $(u_i, q_j)$ or $(v_i, q_j)$
clause edges, and other edges literal edges.
Let $s = t_1$, $l(v) = 1$ for all $v \in V \setminus \{s\}$, $c(e) = 1$, and define a weight function
on $G$ as

$$w(e) = \begin{cases} 2n + 1 & \text{if } e = (u_i, q_j) \text{ or } (v_i, q_j), \\ 1 & \text{otherwise.} \end{cases}$$


Now we claim that the SATISFIABILITY problem is satisfiable if and only if
in the above network the RCL problem has a feasible solution whose modification
cost is at most $L = 2n(m + 1)$.

First we assume that the SATISFIABILITY problem is satisfiable; then the
literals with value true correspond to a path from $t_1$ to $t_{n+1}$. For example,
if $x_1 = x_2 = \mathit{true}$, then the part between $t_1$ and $t_3$ of the path should be
$(t_1, u_1), (u_1, t_2), (t_2, u_2), (u_2, t_3)$. Let us change the weights of the edges on this
path to zero. For each clause vertex $q_j$, choose one clause edge connecting it to
a literal vertex corresponding to a true literal in $C_j$, and change the weight of
this edge to one. It is easy to verify that this is a feasible solution of the reverse
center location problem, and the modification cost is exactly $2n(m + 1)$.
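
The "if" direction of the claim can be replayed on a small instance. The following sketch builds the graph of the reduction, derives $w^*$ from a satisfying assignment as described above, and lets one check the cost and the distances. The encoding of clauses as lists of signed integers (`i` for $x_i$, `-i` for $\bar{x}_i$), the vertex names such as `("t", 1)` and all function names are assumptions introduced for illustration.

```python
import heapq

def build_reduction(n, clauses):
    """Return (w, s, L): w maps each undirected edge (a, b) to its weight."""
    w = {}
    for i in range(1, n + 1):
        for lit in ("u", "v"):                      # u_i for x_i, v_i for its negation
            w[(("t", i), (lit, i))] = 1
            w[((lit, i), ("t", i + 1))] = 1
    w[(("t", n + 1), ("t", n + 2))] = 1
    w[(("t", n + 1), ("t", n + 3))] = 1
    for j, clause in enumerate(clauses, start=1):   # clause edges of weight 2n + 1
        for lit in clause:
            node = ("u", lit) if lit > 0 else ("v", -lit)
            w[(node, ("q", j))] = 2 * n + 1
    return w, ("t", 1), 2 * n * (len(clauses) + 1)

def weights_from_assignment(n, clauses, truth, w):
    """Modified weights w* for a satisfying assignment truth: {i: bool}."""
    w_star = dict(w)
    for i in range(1, n + 1):                       # zero out the path of true literals
        lit = "u" if truth[i] else "v"
        w_star[(("t", i), (lit, i))] = 0
        w_star[((lit, i), ("t", i + 1))] = 0
    for j, clause in enumerate(clauses, start=1):   # one clause edge per clause set to 1
        # next() raises StopIteration if the assignment does not satisfy the clause
        true_lit = next(x for x in clause if truth[abs(x)] == (x > 0))
        node = ("u", true_lit) if true_lit > 0 else ("v", -true_lit)
        w_star[(node, ("q", j))] = 1
    return w_star

def distances(weights, s):
    """Dijkstra distances from s in the undirected graph given by weights."""
    adj = {}
    for (a, b), wt in weights.items():
        adj.setdefault(a, []).append((b, wt))
        adj.setdefault(b, []).append((a, wt))
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, a = heapq.heappop(heap)
        if d > dist[a]:
            continue
        for b, wt in adj[a]:
            if d + wt < dist.get(b, float("inf")):
                dist[b] = d + wt
                heapq.heappush(heap, (d + wt, b))
    return dist
```

For the formula $(x_1 \vee \bar{x}_2) \wedge x_2$ with $x_1 = x_2 = \mathit{true}$, summing `w[e] - w_star[e]` over all edges gives the budget $2n(m+1) = 12$, and every vertex ends up within distance one of $t_1$.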

Conversely, suppose that the reverse center location problem has a feasible
solution with a modification cost at most $2n(m + 1)$. First, for each clause vertex
$q_j$, obviously the weight of one edge between a literal vertex and $q_j$ must be
decreased to one or less, for otherwise any path from $t_1$ to $q_j$ will be longer than
1. The cost for this reduction is at least $2n$. It is easy to understand that for
each $q_j$ we can have only one such edge, because even if only one of them has
two such edges, then all of the $2n(m + 1)$ budget has to be spent on the clause edges
and thus no path from $t_1$ to $t_{n+2}$ can have a length one or less. Denote by $Q$ the
set of such $m$ edges for the $m$ clause vertices. The total cost of these changes on
$Q$ is at least $2nm$.

For any path from $t_1$ to $t_{n+1}$ passing a clause vertex $q_j$, the path must have
two edges between the literal vertices and $q_j$. Therefore, at least one of these
two edges does not belong to $Q$. Furthermore, the length of this path is at least
$4n + 4$, and in order to shorten the length of this path to not greater than one,
the additional cost is at least $2n + 3$. Since $2nm + (2n + 3) > 2n(m + 1)$, we
cannot reduce the length of this path to one within the budget limit. So, a path
from $t_1$ to $t_{n+1}$ which can be modified to be no longer than one must consist of
literal edges only.

Now let us consider how the feasible solution can make the distance between
$t_1$ and $t_{n+2}$, and the distance between $t_1$ and $t_{n+3}$, not longer than one. Notice
that all paths between $t_1$ and $t_{n+2}$ as well as between $t_1$ and $t_{n+3}$ must pass
through $t_{n+1}$. So, we consider how to reduce the length between $t_1$ and $t_{n+1}$
first. Notice also that the remaining budget is at most $2n$ and the original length
of any acyclic literal path between $t_1$ and $t_{n+1}$ is $2n$. Suppose we reduce the
length of a path between $t_1$ and $t_{n+1}$ to $\ell$ with $0 < \ell \le 1$; then the budget left
is at most $\ell$. Then each of the two edges $(t_{n+1}, t_{n+2})$ and $(t_{n+1}, t_{n+3})$ must be
shortened by at least $\ell$. That requires a cost of at least $2\ell$, which is impossible.
So, $\ell$ must be 0, i.e., the only possible way is to change a path from $t_1$ to $t_{n+1}$,
consisting of literal edges only, to be of zero length. Such a path consists of one
and only one of $\{E_i^+, E_i^-\}$ for each $i = 1, 2, \ldots, n$, where $E_i^+ = \{(t_i, u_i), (u_i, t_{i+1})\}$
and $E_i^- = \{(t_i, v_i), (v_i, t_{i+1})\}$. This path corresponds to a truth assignment of
the SATISFIABILITY problem. That is, if $E_i^+$ is used, we assign $x_i$ the value
true; otherwise $x_i$ = false.

We now show that such a truth assignment can guarantee that each clause
$C_j$ is true. In fact, for any clause vertex $q_j$, if all the literal vertices connecting to
it are not on the above mentioned path from $t_1$ to $t_{n+1}$, then the length of any
path between $t_1$ and $q_j$ is at least 2, a contradiction. Hence we know that at least
one literal vertex connecting to the clause vertex $q_j$ must be on the path. The
literal vertex corresponds to a true literal of clause $C_j$, and hence clause $C_j$ is
true. Therefore, we conclude that the SATISFIABILITY problem is satisfiable.

As the number of vertices is m + 3n + 3, and the number of edges does not 
exceed mn + 4n + 2, the transformation from the SATISFIABILITY problem 
to the decision problem of the reverse center location problem is a polynomial 
reduction. Thus the reverse center location problem is NP-hard. The proof is 
completed. □ 

The proof is in fact also true for directed graphs. But if we only consider 
directed graphs, the proof can be slightly simpler. 



3 A Mixed Integer Program Formulation 

Our next aim is to formulate the RCL problem as a mixed-integer linear program 
(MIP). 

Let $T$ be a shortest path tree rooted at $s$ under some weight $w^*$, and denote
by $\pi(v)$ the length of the path from $s$ to $v$ on $T$; then it is straightforward to
observe that $\pi(v) = d_{w^*}(s, v)$. Therefore under $w^*$, $d_{w^*}(s, v) \le l(v)$ holds for all
$v \in V \setminus \{s\}$ if and only if the shortest path tree $T$ with $s$ as the root meets the
condition $\pi(v) \le l(v)$ for all $v \in V \setminus \{s\}$.

From Bellman’s equation we know that $T$ is a shortest path tree rooted
at $s$ under $w^*$ if and only if

$$
\begin{array}{ll}
\pi(s) = 0, \\
\pi(v) - \pi(u) = w^*(e) & \forall e = (u, v) \in T, \\
\pi(v) - \pi(u) \le w^*(e) & \text{otherwise},
\end{array}
$$

where the tree is oriented from the root to all leaf vertices.

For any spanning tree $T$ which is rooted at $s$, we can use zero-one variables
to distinguish whether the edges are on $T$ or not, for example,

$$x(e) = \begin{cases} 0 & \forall e \in T, \\ 1 & \text{otherwise.} \end{cases}$$
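
As a small illustration of Bellman's conditions, the sketch below computes the potentials $\pi$ along a given rooted spanning tree and then tests the inequality on every edge. The data layout (`edges` as a dict on undirected edges, `tree` as a child-to-parent map) and the tolerance `tol` are assumptions for the example; for an undirected edge the inequality is tested in both directions.

```python
def is_shortest_path_tree(edges, tree, s, tol=1e-9):
    """edges: {(u, v): w*} undirected; tree: {child: parent} for every vertex v != s."""
    pi = {s: 0.0}

    def potential(v):
        # pi(v) = pi(parent) + w*(parent, v): equality holds on tree arcs by construction
        if v not in pi:
            p = tree[v]
            wt = edges[(p, v)] if (p, v) in edges else edges[(v, p)]
            pi[v] = potential(p) + wt
        return pi[v]

    for v in tree:
        potential(v)
    # every edge must satisfy |pi(v) - pi(u)| <= w*(e)
    return all(abs(pi[v] - pi[u]) <= wt + tol for (u, v), wt in edges.items())
```

On a triangle with weights 1, 1 and 3, the tree that uses the two unit edges passes the test, while the tree that uses the edge of weight 3 does not.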



From the above analysis, we have

Theorem 2 Let $(\pi, x)$ be a feasible solution of the following system of
inequalities and equations,

$$
\begin{cases}
\pi(v) - \pi(u) \le w^*(e) & \forall e = (u, v) \in E, \\
\pi(v) - \pi(u) + M x(e) \ge w^*(e) & \forall e = (u, v) \in E, \\
\pi(s) = 0, \\
\sum_{e \in E} x(e) = |E| + 1 - |V|, \\
\sum_{e \in C} x(e) \ge 1 & \forall \text{ cycle } C, \\
x(e) \in \{0, 1\} & \forall e \in E.
\end{cases}
\tag{1}
$$

Then $T = \{e \in E \mid x(e) = 0\}$ corresponds to a shortest path tree rooted at $s$ under
$w^*$, where $M$ is a sufficiently large number.



Obviously, $M = \sum_{e \in E} w(e)$ is big enough, where $w$ is the original weight vector.

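By Theorem 2, the reverse center location problem can be formulated as the
following MIP problem.

$$
\begin{array}{lll}
\min & \sum_{e \in E} c(e)\theta(e) \\
\text{s.t.} & \pi(v) - \pi(u) + \theta(e) \le w(e) & \forall e = (u, v) \in E, \\
& \pi(v) - \pi(u) + \theta(e) + M x(e) \ge w(e) & \forall e = (u, v) \in E, \\
& \pi(s) = 0, \\
& \pi(v) \le l(v) & \forall v \in V \setminus \{s\}, \\
& \sum_{e \in E} x(e) = |E| - |V| + 1, \\
& \sum_{e \in C} x(e) \ge 1 & \forall \text{ cycle } C, \\
& 0 \le \theta(e) \le b(e) & \forall e \in E, \\
& x(e) \in \{0, 1\} & \forall e \in E.
\end{array}
\tag{2}
$$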



4 A Heuristic Algorithm 



Due to the above MIP formulation, we can design a heuristic which is based
on an LP relaxation. First let us consider the linear programming relaxation of
problem (2). That is, we replace the last constraint, i.e., the integer requirement
$x(e) \in \{0, 1\}$, by bounded continuous variables $0 \le x(e) \le 1$. However, since
problem (2) includes one constraint for each cycle in $G$, the number of these
constraints may be of an exponential order of $|V|$. To enumerate all the cycles
is an exhausting job. So, in order to solve the LP relaxation quickly, we adopt a
cutting plane method proposed by J. Leung.
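For a collection $\mathcal{C}$ of cycles in $G$, let $RLP(\mathcal{C})$ be the linear program:

$$
\begin{array}{lll}
\min & \sum_{e \in E} c(e)\theta(e) \\
\text{s.t.} & \pi(v) - \pi(u) + \theta(e) \le w(e) & \forall e = (u, v) \in E, \\
& \pi(v) - \pi(u) + \theta(e) + M x(e) \ge w(e) & \forall e = (u, v) \in E, \\
& \pi(s) = 0, \\
& \pi(v) \le l(v) & \forall v \in V \setminus \{s\}, \\
& \sum_{e \in E} x(e) = |E| - |V| + 1, \\
& \sum_{e \in C} x(e) \ge 1 & \forall C \in \mathcal{C}, \\
& 0 \le \theta(e) \le b(e) & \forall e \in E, \\
& 0 \le x(e) \le 1 & \forall e \in E.
\end{array}
\tag{3}
$$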



The following cutting plane method can be used to solve the LP relaxation
problem.

Algorithm C:

Step 1. $\mathcal{C} = \emptyset$.

Step 2. Solve $RLP(\mathcal{C})$, denote by $x^*$ the optimal solution.

Step 3. Determine if there is a cycle $C$ such that
$\sum_{e \in C} x^*(e) < 1$.
Let $\mathcal{C}'$ be the collection of such cycles.

Step 4. If $\mathcal{C}' = \emptyset$, stop, the LP relaxation is solved. Otherwise, let $\mathcal{C} \leftarrow \mathcal{C} \cup \mathcal{C}'$,
and return to Step 2.

In Step 3, we can use the algorithm below, given by Grötschel, Jünger and
Reinelt, to find $\mathcal{C}'$.

Algorithm GJR:

Step 1. $\mathcal{C}' = \emptyset$.

Step 2. For each edge $e = (u, v)$, using $x^*$ as a weight vector, compute the
distance $d_{x^*}(u, v)$ in the subgraph $G \setminus \{e\}$.

Step 3. For each pair $u$ and $v$, if $e = (u, v) \in E$ and $x^*(e) + d_{x^*}(u, v) < 1$,
then let $C$ be the cycle consisting of the edge $e$ and the edges which
realize the distance $d_{x^*}(u, v)$ between $u$ and $v$. If $C \notin \mathcal{C}'$, add $C$ to $\mathcal{C}'$.
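
The separation step can be sketched as follows. For every edge the routine runs a shortest path computation with that edge removed and reports the cycle when the total $x^*$ value falls below one. The function name, the dict-based input, the tolerance `eps` and the use of Dijkstra's algorithm are assumptions for this illustration; parallel edges are not handled.

```python
import heapq
import math

def separate_cycles(vertices, x_star, eps=1e-9):
    """x_star: {(u, v): value} on undirected edges. Returns violated cycles as edge lists."""
    adj = {v: [] for v in vertices}
    for (u, v), val in x_star.items():
        adj[u].append((v, val, (u, v)))
        adj[v].append((u, val, (u, v)))
    found = {}
    for e, val in x_star.items():
        u, v = e
        dist, pred = {u: 0.0}, {}
        heap = [(0.0, u)]
        while heap:                          # Dijkstra from u in G \ {e}
            d, a = heapq.heappop(heap)
            if d > dist.get(a, math.inf):
                continue
            for b, wt, edge in adj[a]:
                if edge == e:
                    continue                 # the edge e itself is removed
                if d + wt < dist.get(b, math.inf):
                    dist[b] = d + wt
                    pred[b] = (a, edge)
                    heapq.heappush(heap, (d + wt, b))
        if v in dist and val + dist[v] < 1 - eps:
            cycle, node = [e], v
            while node != u:                 # recover the path edges back to u
                node, edge = pred[node]
                cycle.append(edge)
            found[frozenset(cycle)] = cycle  # the same cycle is stored only once
    return list(found.values())
```

Each returned cycle yields one cut $\sum_{e \in C} x(e) \ge 1$ to be added before the relaxation is solved again.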

If the optimal solution $x^*$ of the LP relaxation of problem (2) is integer-valued,
this solution is indeed the optimal solution of the reverse center location
problem and we are done. Otherwise, we need to go further to find a heuristic
solution for the RCL problem.

From Theorem 2, we know that the optimal solution of problem (2) corresponds
to a shortest path tree (we may call it the optimal shortest path
tree), and we need to change only the weights on the tree while keeping the other
weights unchanged. Also, on this tree, the total value of $x^*$ equals zero, which
is of course the smallest total $x^*$ value among all $s$-rooted spanning trees. So, if we
set the optimal solution $x^*$ of the LP relaxation as the weight vector of $E$, it
is reasonable to assume that the minimum spanning tree $T$ (rooted at $s$) is a
‘good’ approximation of the optimal shortest path tree. Then we can restrict
our consideration to $T$ only, solve problem (2) with $T$ replacing $E$ (we call it
the approximate RCL problem), and take its solution as a heuristic solution of
problem (2).

5 The Dual Problem and Its Solution 

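When we substitute $E$ with $T$, problem (2) is reduced to the following linear
program.

$$
\begin{array}{lll}
\min & \sum_{e \in E} c(e)\theta(e) \\
\text{s.t.} & \pi(v) - \pi(u) + \theta(e) = w(e) & \forall e = (u, v) \in T, \\
& \pi(s) = 0, \\
& \pi(v) \le l(v) & \forall v \in V \setminus \{s\}, \\
& 0 \le \theta(e) \le b(e) & \forall e \in T.
\end{array}
\tag{4}
$$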

Although problem (4) can be solved by any LP method, we can present a
strongly polynomial combinatorial algorithm with complexity $O(|V|^2)$ to solve
it.

We can regard $T$ as a directed tree with the orientation from the root $s$ to
all leaf vertices. For each $v \in V$, let $P(v)$ denote the path from $s$ to $v$ on the
tree $T$, $P(u, v)$ the subpath of $P(v)$ starting at $u$ for any vertex $u$ on $P(v)$, and
$r(v) := \sum_{e \in P(v)} w(e) - l(v)$. A vertex $v$ is called a violating vertex if $r(v) > 0$,
i.e., the distance from $s$ to $v$ under the original weight vector $w$ is greater than
the upper bound $l(v)$. Denote by $V^* \subseteq V$ the set of violating vertices. Then
problem (4) can be re-written as

$$
\begin{array}{lll}
\min & \sum_{e \in E} c(e)\theta(e) \\
\text{s.t.} & \sum_{e \in P(v)} \theta(e) \ge r(v) & \forall v \in V^*, \\
& 0 \le \theta(e) \le b(e) & \forall e \in T,
\end{array}
\tag{5}
$$

which is the approximate reverse center location problem we are going to solve.

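The dual of problem (5) can be written as

$$
\begin{array}{lll}
\max & \sum_{v \in V^*} r(v) y(v) - \sum_{e \in E} b(e) z(e) \\
\text{s.t.} & \sum \{y(v) \mid P(v) \ni e,\ v \in V^*\} - z(e) \le c(e) & \forall e \in T, \\
& y(v) \ge 0 & \forall v \in V^*, \\
& z(e) \ge 0 & \forall e \in T.
\end{array}
\tag{6}
$$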
Without loss of generality, we may assume that all leaf vertices of $T$ are
violating vertices, for otherwise we can remove such vertices and the incident
arcs from $T$ and let the values of $\theta$ on these arcs be zero.

Now construct an auxiliary network $N = (V^+, A^+, c^+, k^+)$ of $T$, in which
$V^+ = V \cup \{t\}$ is the vertex set of $N$, and $A^+ = \{e^1 = (u, v), e^2 = (u, v) \mid e = (u, v) \in
T\} \cup \{(v, t) \mid v \in V^*\} \cup \{(t, s)\}$ is the arc set of $N$. $c^+$ and $k^+$ are the capacity vector
and cost vector of $N$ respectively, which are defined as follows.

$$
c^+(e) = \begin{cases} c(u, v) & \forall e = e^1 = (u, v) \in A^+, \\ +\infty & \text{otherwise.} \end{cases}
$$

$$
k^+(e) = \begin{cases} 0 & \forall e = e^1 \in A^+ \text{ or } e = (t, s), \\ -b(u, v) & \forall e = e^2 = (u, v) \in A^+, \\ r(v) & \forall e = (v, t) \in A^+. \end{cases}
$$

Note that each $e \in T$ corresponds to two arcs $e^1$ and $e^2$ in $N$, and we
introduce an arc from $t$ to $s$ with an infinite capacity and zero cost. We consider
the maximum cost circulation problem on $N$ and can show that

Theorem 3 The problem (6) is equivalent to finding a maximum cost circulation
on $N$.



Generally speaking, finding a maximum cost flow on a network can be done
by some strongly polynomial combinatorial algorithms, but due to the
special structure of $N$, we can find a maximum cost circulation on $N$ much faster
by the following simple algorithm. Note that instead of obtaining the flow $f$,
the algorithm gives directly the optimal solution $(y, z)$ of problem (6).



Algorithm F.

Step 0. Let $y(v) = 0$ for all $v \in V^*$, and $z(e) = 0$ for all $e \in T$.

Step 1. Find a maximum cost path $P$ from $s$ to $t$ on $N$.

Step 2. Let $k^+(P)$ be the cost of $P$. If $k^+(P) \le 0$, stop, the current $(y, z)$ is the
optimal solution of problem (6).

Step 3. If $k^+(P) > 0$, let $e^+ = \arg\min\{c^+(e) \mid e \in P\}$. If $c^+(e^+) = +\infty$, stop,
problem (6) is unbounded and thus problem (5) is infeasible. Otherwise
go to Step 4.

Step 4. Let $v \in V^*$ be the precedent vertex of $t$ on the path $P$. Set $y(v) \leftarrow
y(v) + c^+(e^+)$.

Step 5. For each $e \in T$ with the corresponding $e^1$ or $e^2$ on $P$, if $e^2 \in P$, then
set $z(e) \leftarrow z(e) + c^+(e^+)$; if $e^1 \in P$, then set $c^+(e^1) \leftarrow c^+(e^1) - c^+(e^+)$.
If $c^+(e^1) = 0$, delete $e^1$ from $N$. Return to Step 1.

Since $N \setminus \{(t, s)\}$ is acyclic, we can find the maximum cost $s$-$t$ path in
$O(|A^+|)$ $(= O(|V|))$ operations. Moreover, at each iteration, at least one arc with
finite capacity is deleted (unless $c^+(e^+) = +\infty$, which terminates the algorithm),
hence there are at most $|V| - 1$ iterations since the number of arcs with finite
capacity is $|V| - 1$. Therefore, we can conclude that Algorithm F runs in $O(|V|^2)$
operations.
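
The steps of Algorithm F can be sketched directly on the tree, without building $N$ explicitly: on every tree arc a maximum cost path uses the zero-cost copy while it still has capacity, and the copy of cost $-b(e)$ afterwards. The representation below (tree arcs named by their lower endpoint, `r` given only for the violating vertices, `None` returned for an infeasible instance) is an assumption made for this illustration, and the path search is the naive one rather than the $O(|V|)$ search mentioned above.

```python
import math

def algorithm_f(parent, s, c, b, r):
    """parent: {v: parent of v} for v != s; c, b: {v: value on tree arc (parent[v], v)};
    r: {v: r(v)} for the violating vertices only. Returns (y, z), or None if infeasible."""
    cap = dict(c)                        # remaining capacity of the zero-cost arc copy
    y = {v: 0 for v in r}
    z = {v: 0 for v in parent}

    def path(v):                         # tree arcs on P(v), named by lower endpoint
        arcs = []
        while v != s:
            arcs.append(v)
            v = parent[v]
        return arcs

    while True:
        best = None
        for v in r:                      # Step 1: best s-t path through each arc (v, t)
            arcs = path(v)
            cost = r[v] - sum(b[a] for a in arcs if cap[a] <= 0)
            if best is None or cost > best[0]:
                best = (cost, v, arcs)
        if best is None or best[0] <= 0:
            return y, z                  # Step 2: no path of positive cost remains
        _, v, arcs = best
        delta = min((cap[a] for a in arcs if cap[a] > 0), default=math.inf)
        if delta == math.inf:
            return None                  # Step 3: dual unbounded, primal infeasible
        y[v] += delta                    # Step 4
        for a in arcs:                   # Step 5
            if cap[a] > 0:
                cap[a] -= delta
            else:
                z[a] += delta
```

For a single arc with $c = 1$, $b = 4$ and $r = 3$ the sketch stops with $y = 1$, $z = 0$, whose dual objective value 3 matches the primal optimum $\theta = 3$; with $b = 2$ the instance is reported as infeasible.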
6 Recovering the Primal Optimal Solution

Now we consider how to recover an optimal solution of the primal problem (5)
from an optimal solution of the dual problem (6). Let $(y^*, z^*)$ be the optimal
solution of the dual problem (6). Then the primal problem (5) also has an optimal
solution, denoted by $\theta^*$. In fact, as we already mentioned, the components of $\theta^*$
which correspond to the edges not on $T$ are zero. So, we only need to determine $\theta^*$
on $T$. By the complementary slackness theorem of linear programming, $\theta^*$
is an optimal solution of (5) if and only if

$$\sum_{e \in P(v)} \theta^*(e) = r(v) \quad \forall v \in V^* \text{ and } y^*(v) > 0, \tag{7}$$

$$\sum_{e \in P(v)} \theta^*(e) \ge r(v) \quad \forall v \in V^* \text{ and } y^*(v) = 0, \tag{8}$$

$$0 \le \theta^*(e) \le b(e) \quad \forall e \in T, \tag{9}$$

$$\theta^*(e) = 0 \quad \forall e \in T \text{ and } \sum_{P(v) \ni e} y^*(v) - z^*(e) < c(e), \tag{10}$$

$$\theta^*(e) = b(e) \quad \forall e \in T \text{ and } z^*(e) > 0. \tag{11}$$



Now we try to solve the above inequality system. Due to its special structure,
we are able to present an algorithm which can solve the problem in $O(|V| \log |V|)$
operations.

For each edge $e$ satisfying one of the strict inequalities in (10) and (11), we
can have a pre-process as follows.

If $e = (u, v) \in T$ satisfies the strict inequality in (10), then we can set
$\theta^*(e) = 0$, contract the arc to vertex $u$ and set $r(u) \leftarrow \max\{r(u), r(v)\}$. It
is easy to see that a feasible solution of the resulting system corresponds to a
feasible solution of the original system, and vice versa. Similarly, if $e = (u, v) \in T$
satisfies the strict inequality in (11), we can set $\theta^*(e) = b(e)$ and then contract
the arc to vertex $u$, set $r(u) \leftarrow \max\{r(u), r(v) - b(e)\}$, and set $r(q) \leftarrow r(q) - b(e)$
for any successive vertex $q$ of $v$.

After this pre-process, we can assume that there are no arcs on $T$ satisfying
one of the two strict inequalities in (10) and (11).

Let $V^{\oplus} = \{v \in V^* \mid y^*(v) > 0\}$. We are now ready to present the algorithm
to solve the inequality system.

Algorithm R.

Step 0. Sort the vertices in $V^{\oplus}$ into $v_1, v_2, \ldots, v_k$ (renumbering if necessary)
by the nondecreasing order of $r(v)$ with the additional condition that
if $r(v_i) = r(v_j)$ and $v_i$ is on $P(v_j)$, then $i < j$.
Set $T_1 = \{s\}$, $p(s) = 0$ and $j = 1$.

Step 1. If $j = k + 1$, put $\theta^*(e) = b(e)$ for all arcs $e \in T \setminus T_j$, stop. The optimal
solution has been obtained.

Step 2. Scan vertex $v_j$. That is, let $s'$ be the last vertex of $P(v_j)$ belonging to
$T_j$. Along the path $P(s', v_j)$, one arc at a time, say $e = (u, v)$ being
the current arc, let

$$p(v) \leftarrow \min\{p(u) + b(e), r(v_j)\} \tag{12}$$

and $\theta^*(e) = p(v) - p(u)$.

Step 3. Let $T_{j+1} \leftarrow T_j \cup P(s', v_j)$ and $j \leftarrow j + 1$. Return to Step 1.
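
A compact sketch of Algorithm R follows. It assumes the pre-process has already been carried out, names every tree arc by its lower endpoint, and receives $r(v)$ only for the vertices with $y^*(v) > 0$; these conventions and the function name are assumptions for the illustration. Ties in $r(v)$ are broken by depth, which places a vertex before its descendants as required in Step 0.

```python
def algorithm_r(parent, s, b, r_pos):
    """parent: {v: parent of v}; b: {v: bound on arc (parent[v], v)};
    r_pos: {v: r(v)} for the vertices with y*(v) > 0. Returns theta* per arc."""
    def depth(v):
        d = 0
        while v != s:
            v, d = parent[v], d + 1
        return d

    # Step 0: nondecreasing r(v); among equal values ancestors come first
    order = sorted(r_pos, key=lambda v: (r_pos[v], depth(v)))
    p = {s: 0}                           # p(v): total decrement on the path P(v)
    theta = {}
    for vj in order:                     # Step 2: scan v_j
        new_part = []
        v = vj
        while v not in p:                # climb up to s', the last processed vertex
            new_part.append(v)
            v = parent[v]
        for v in reversed(new_part):     # walk P(s', v_j) downward, arc by arc
            u = parent[v]
            p[v] = min(p[u] + b[v], r_pos[vj])   # formula (12)
            theta[v] = p[v] - p[u]
    for v in parent:                     # Step 1 at termination: arcs never scanned
        theta.setdefault(v, b[v])
    return theta
```

On the path $s \to a \to b$ with both bounds equal to 2 and $r(b) = 3$, the first arc is saturated and the second receives the remaining decrement of 1.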



Theorem 4 The vector $\theta^*$ generated by Algorithm R meets conditions (7)-(11),
and thus is an optimal solution to problem (5).

Let us analyze the complexity of Algorithm R now. It is easily seen that the
ordering of the vertices in $V^{\oplus}$ can be done in $O(|V^{\oplus}| \log |V^{\oplus}|)$ operations, and
each vertex is handled only once. Notice that the ordering of the vertices in $V^{\oplus}$
dominates the computational complexity, and hence the total complexity is at
most $O(|V| \log |V|)$.

7 Conclusions 

We now summarize our complete heuristic algorithm for solving the reverse cen- 
ter location problem. 

Algorithm H. 

Input: $G = (V, E, w)$, $s$, $l$, $b$.

Output: $w^*$.

Step 1. Employ Algorithm C to find the optimal solution $x^*$ of the linear
programming relaxation of problem (2).

Step 2. Use $x^*$ as weights on $G$ to find a minimum spanning tree $T$ on $G$.

Step 3. Use Algorithm F to obtain a dual optimal solution $(y^*, z^*)$.

Step 4. Use Algorithm R to recover the optimal solution $\theta^*$ of problem (5).
Then $w^* = w - \theta^*$ is an approximate solution of problem (2).

Let us now return to the general case of inverse optimization. There are 
several types of inverse problems to be put forward. The first type is that a 
feasible solution is given, and we need to adjust the weights as little as possible
such that the given one becomes an optimal solution. So far this type of inverse 
problems receives most attention, see [2I3ITT] , un-Hz]. This type of inverse 
problems is often polynomially solvable if the original problems are so (a general
result can be seen in [19]). In fact, strongly polynomial methods are often available
if inverse network problems are concerned. However, there are still exceptional
cases in which, although the original problem is polynomially solvable, its first
type inverse problem is NP-hard. See for example [4].

The second type of inverse problems does not require the given feasible solution
to become an optimal solution, or no particular feasible solution is specified,
and we only require that after the adjustment of weights the new optimal
solution ensures that its corresponding optimal value meets some given bound
requests. The problems discussed in this paper and in [1] are of this type. The
results of these two papers indicate that this type of inverse problems is often
NP-hard (to distinguish the two types of problems verbally, we call the first type of
problems inverse problems and the second type reverse problems). So, solving
this type of problems is more challenging and remains one of the major
issues in the study of inverse optimization problems. Hopefully, recent progress
in developing approximation algorithms for NP-hard problems, see for example 
PI, shall help solve this type of inverse problems. 

In fact there are also even more difficult types of inverse problems, see for
example [7]. Indeed, inverse optimization is still a wide open and
dynamic field.
Appendix A: Proof of Theorem 3

First, suppose $f$ is a maximum cost circulation on $N$. Let us define $y(v) = f(v, t)$
for any $v \in V^*$, and $z(e) = f(e^2)$ for any $e \in T$; then we can verify that $(y, z)$ is
a feasible solution of (6). Also, the maximum cost is

$$\sum_{e \in A^+} k^+(e) f(e) = \sum_{v \in V^*} r(v) y(v) - \sum_{e \in E} b(e) z(e),$$

which is just the objective value of problem (6) at the feasible solution $(y, z)$.
Conversely, if $(y, z)$ is an optimal solution of (6), we have

$$z(e) = \max\{0, \sum \{y(v) \mid P(v) \ni e,\ v \in V^*\} - c(e)\}.$$

Define

$$
\begin{array}{ll}
f(v, t) = y(v) & \forall v \in V^*, \\
f(e^1) = \min\{c(e), \sum \{y(v) \mid P(v) \ni e,\ v \in V^*\}\} & \forall e \in T, \\
f(e^2) = z(e) & \forall e \in T, \\
f(t, s) = \sum_{v \in V^*} y(v).
\end{array}
$$

As for any $e \in T$, $f(e^1) \le c(e)$, the capacity requirement is met. Also, whether or not
$c(e) \le \sum \{y(v) \mid P(v) \ni e,\ v \in V^*\}$, we always have that for each $e \in T$, the flow

$$f(e^1) + f(e^2) = \sum \{y(v) \mid P(v) \ni e,\ v \in V^*\},$$

from which it is easy to see that $f$ is a circulation on $N$, and the total cost of
the flow $f$ equals the maximum objective value of (6).

Therefore, the two problems have the same optimum value, and the optimal
solution of problem (6) can be obtained easily once we have solved the maximum cost
circulation problem. The theorem is proved. □



Appendix B: Remarks for Algorithm R 

Remark 1. In the algorithm we obtain $\theta^*$ by treating the tree $T$ piece
by piece. After scanning a $v_j$ in $V^0$, the arcs on $P(s', v_j)$ have been
assigned their $\theta^*$ values. $p(v)$ represents the total decrement of the
weights on $P(s,v)$, and $T'$ stands for the subtree which has already been
processed after scanning the first $j-1$ vertices $v_1, \ldots, v_{j-1}$. It is
possible that after scanning all vertices in $V^0$, $T'$ is still a proper
subset of $T$. In such a case, we assign $\theta^*(e) = b(e)$ for all arcs
$e \in T \setminus T'$.

Remark 2. From the algorithm, it is easy to see that

$$\sum_{e \in P(v)} \theta^*(e) = p(v) \qquad \forall v \in V.$$

Remark 3. An arc $e \in T$ is called saturated if $\theta^*(e) = b(e)$ and
unsaturated otherwise. Suppose $e = (u,v) \in T'_j$; then the value
$\theta^*(e)$ is assigned when one of $\{v_1, \ldots, v_{j-1}\}$ is scanned.
Say the arc is processed when we scan $v_i$ ($i \le j-1$). Then $P(v_i)$ passes
through $e$. If this arc is unsaturated, then by the assignment formula of the
algorithm we have $p(v) = r(v_i)$, which means that for any vertex $x$ on the
path $P(v, v_i)$, $p(x) = r(v_i)$, and thus for any arc along $P(v, v_i)$, the
value of $\theta^*$ must be zero.



Appendix C: Proof of Theorem 4 

As we assume that a pre-process has been carried out before Algorithm R is
employed, we only need to consider (6)-(8).

If $k = 0$, i.e., $V^0 = \emptyset$, then (7) does not occur, and by Step 1,
$\theta^*(e) = b(e)$ for all arcs $e$ of $T$, which means that (6) and (8)
hold. So, in the proof below, we assume $k > 0$.

We now show that all arcs and vertices which are processed when we scan
$v_j$, $j = 1, \ldots, k$, satisfy (6)-(8).

We consider any $j \le k$. Recall that $s'$ is the last vertex of $P(v_j)$
belonging to $T'_j$.

We begin by proving (6). For each $e \in P(s', v_j)$, by Step 2, of course
$\theta^*(e) \le b(e)$. To prove $\theta^*(e) \ge 0$ we only need to show that
$p(s') \le r(v_j)$. Assuming $p(s')$ is defined when we scan $v_i$, then
$i < j$, implying $p(s') \le r(v_i) \le r(v_j)$. Thus (6) holds.

Next let us show (7). By the assignment formula of the algorithm,
$p(v_j) \le r(v_j)$. Assuming $p(v_j) < r(v_j)$, then there exists at least one
unsaturated arc on $P(s')$. Let $e = (u,v)$ be the last such arc. Then by
Remark 3 there exists $v_i \in V^0$, $i < j$, such that $P(v_i)$ passes through
$e$ and $p(v) = r(v_i)$. Let $\theta$ be the optimal solution of the problem
which we mentioned at the beginning of this section; thus for any
$v_j \in V^0$, $\theta$ satisfies the corresponding inequality. So, we have

$$\begin{aligned}
r(v_j) &\le \sum_{e \in P(v_j)} \theta(e) \\
&= \sum_{e \in P(v)} \theta(e) + \sum_{e \in P(v,v_j)} \theta(e) \\
&\le \sum_{e \in P(v_i)} \theta(e) + \sum_{e \in P(v,v_j)} b(e) \\
&= r(v_i) + \sum_{e \in P(v,v_j)} b(e) \\
&= p(v) + \sum_{e \in P(v,v_j)} \theta^*(e) \\
&= p(v_j),
\end{aligned}$$

a contradiction. Hence $p(v_j) = r(v_j)$, i.e., (7) is true for every
$v_j \in V^0$.

We then turn to prove (8). For each vertex $v' \in V^*$ on
$P(s', v_j) \setminus \{s'\}$ ($s'$ is not processed in the $j$-th scan), if all
arcs of $P(v')$ are saturated, then due to the feasibility of the inverse
problem, clearly (8) holds. Otherwise let $e' = (u,v)$ be the last unsaturated
arc on $P(v')$. There are only two possibilities. First, if $e'$ is on
$P(s', v_j)$, then

$$p(v') = p(v_j) = r(v_j) = \sum_{e \in P(v_j)} \theta(e) \ge \sum_{e \in P(v')} \theta(e) \ge r(v').$$

That means $v'$ satisfies the inequality in (8). Second, if $e'$ is on $P(s')$,
then again by Remark 3, there exists $v_i \in V^0$, $i < j$, such that $P(v_i)$
passes through $e'$ and $p(v) = r(v_i)$. We have

$$\begin{aligned}
r(v') &\le \sum_{e \in P(v')} \theta(e) \\
&= \sum_{e \in P(v)} \theta(e) + \sum_{e \in P(v,v')} \theta(e) \\
&\le \sum_{e \in P(v_i)} \theta(e) + \sum_{e \in P(v,v')} b(e) \\
&= r(v_i) + \sum_{e \in P(v,v')} b(e) \\
&= p(v) + \sum_{e \in P(v,v')} \theta^*(e) \\
&= p(v').
\end{aligned}$$

So, (8) always holds for any vertex $v' \in V^*$ which has been processed when
we scan $v_j$.

Finally we consider the arcs and vertices which are not contained in $T'_k$,
i.e., the case $j = k+1$. By Step 1, all arcs $e \in T \setminus T'_k$ have
$\theta^*(e) = b(e)$ and thus (6) holds.

For all vertices $q \in V^* \cap \{T \setminus T'_k\}$, there are two possible
cases. First, there is a vertex of $V^0$ on $P(q)$. Let $v^*$ be the last such
vertex. Then

$$\begin{aligned}
r(q) &\le \sum_{e \in P(q)} \theta(e) \\
&\le \sum_{e \in P(v^*)} \theta(e) + \sum_{e \in P(v^*,q)} b(e) \\
&= r(v^*) + \sum_{e \in P(v^*,q)} b(e) \\
&= \sum_{e \in P(q)} \theta^*(e),
\end{aligned}$$

i.e., (8) is true for this vertex. Second, there is no such vertex. In this
case (8) holds trivially as all arcs on $P(q)$ are saturated.

The proof of the theorem is completed. □

Abstract. It is of interest in cryptographic applications to obtain practical
performance improvements for the discrete logarithm problem over prime fields
$\mathbb{F}_p$ with $p$ of size $\le 500$ bits. The linear sieve and the cubic
sieve methods described in Coppersmith, Odlyzko and Schroeppel's paper [3] are
two practical algorithms for computing discrete logarithms over prime fields.
The cubic sieve algorithm is asymptotically faster than the linear sieve
algorithm.

We discuss an efficient implementation of the cubic sieve algorithm
incorporating two heuristic principles. We demonstrate through empirical
performance measures that, for a special class of primes, the cubic sieve
method runs about two to three times faster than the linear sieve method even
for small prime fields of size about 150 bits.



1 Introduction 

Computation of discrete logarithms over a finite field $\mathbb{F}_q$ is a
difficult problem. No algorithms are known that solve the problem in time
polynomially bounded by the size of the field (i.e. $\log q$). The index
calculus algorithm [3,7,9,10,11] is currently the best known algorithm for this
purpose and has a sub-exponential expected running time given by

$$L(q,\gamma,c) = \exp\left((c + o(1))(\log q)^{\gamma}(\log\log q)^{1-\gamma}\right)$$

for some constant $c$ and for some real number $0 < \gamma < 1$. For practical
applications, one typically uses prime fields or fields of characteristic 2.
In this paper, we focus on prime fields only.

Let $\mathbb{F}_p$ be a prime field of cardinality $p$. For an element
$a \in \mathbb{F}_p$, we denote by $\bar{a}$ the representative of $a$ in the
set $\{0, 1, \ldots, p-1\}$. Let $g$ be a primitive element of $\mathbb{F}_p$
(i.e. a generator of the cyclic multiplicative group $\mathbb{F}_p^*$). Given
an element $a \in \mathbb{F}_p^*$, there exists a unique integer
$0 \le x \le p-2$ such that $a = g^x$ in $\mathbb{F}_p$. This integer $x$ is
called the discrete logarithm or index of $a$ in $\mathbb{F}_p$ with respect to
$g$ and is denoted by $\mathrm{ind}_g(a)$. The determination of $x$ from the
knowledge of $p$, $g$ and $a$ is referred to as the discrete logarithm problem.
In general, one need not assume $g$ to be a primitive element and is supposed
to compute $x$ from $a$ and $g$, if such an $x$ exists (i.e. if $a$ belongs to
the cyclic subgroup of $\mathbb{F}_p^*$ generated by $g$). In this paper, we
always assume for simplicity that $g$ is a primitive element of $\mathbb{F}_p$.
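To make the definition concrete, the following minimal Python sketch computes an index by exhaustive search. It is only an illustration for very small primes and is not one of the algorithms studied in this paper; the function name `discrete_log` is ours.

```python
def discrete_log(a: int, g: int, p: int) -> int:
    """Return x with 0 <= x <= p-2 and g**x == a (mod p), by exhaustive search."""
    a %= p
    power = 1  # g**0
    for x in range(p - 1):
        if power == a:
            return x
        power = (power * g) % p
    raise ValueError("a is not in the subgroup generated by g")

# 2 is a primitive element modulo 11, so every nonzero residue has an index.
x = discrete_log(9, 2, 11)
print(x, pow(2, x, 11))
```

The search takes up to p - 1 multiplications, which is exactly why subexponential methods are needed for fields of cryptographic size.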

In what follows we denote by $L(p,c)$ any quantity that satisfies

$$L(p,c) = L(p,1/2,c) = \exp\left((c + o(1))\sqrt{\ln p \,\ln\ln p}\right),$$

where $c$ is a positive constant and $\ln x$ is the natural logarithm of $x$.^1
When $p$ is understood from the context, we write $L[c]$ for $L(p,c)$. In
particular, $L[1]$ is denoted simply by $L$.
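For a feeling of the magnitudes involved, the sketch below evaluates this expression numerically with the $o(1)$ term simply dropped, which is an assumption made only for illustration. The function name `L` and the use of $2^{150}$ as a stand-in for a 150-bit prime are ours.

```python
import math

def L(p: int, c: float, gamma: float = 0.5) -> float:
    """exp(c * (ln p)**gamma * (ln ln p)**(1 - gamma)), with the o(1) term dropped."""
    ln_p = math.log(p)
    return math.exp(c * ln_p ** gamma * math.log(ln_p) ** (1.0 - gamma))

p = 2 ** 150  # stand-in for a 150-bit prime; only the size matters here
for c in (0.5, 1.0, 1.5, 2.0):
    print(f"L[{c}] ~ {L(p, c):.3e}")
```

Since the exponent is linear in $c$, $L[1]$ is the square of $L[1/2]$, so halving $c$ takes the square root of the estimated work.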

The naive index calculus algorithm [10, Section 6.6.2] for the computation of
discrete logarithms over prime fields and the adaptations of this algorithm take 
time L[c] for c between 1.5 and 2 and are not useful in practice for prime fields Fp 
with p > 2^^^. Coppersmith, Odlyzko and Schroeppel [3] proposed three variants 
of the index calculus method that run in time L[l] and are practical for p < 

A subsequent paper [7] by LaMacchia and Odlyzko reports implementation of
two of these three variants, namely the linear sieve method and the Gaussian 
integer method. They were able to compute discrete logarithms in Fp with p of 
about 200 bits. 

The paper [3] also describes a cubic sieve algorithm due to Reyneri for the
computation of discrete logarithms over prime fields. The cubic sieve algorithm
has a heuristic running time of $L[\sqrt{2\alpha}]$ for some
$\frac{1}{3} \le \alpha < \frac{1}{2}$ and is, therefore, asymptotically faster
than the linear sieve algorithm (and the other $L[1]$ algorithms described in
[3]). However, the authors of [3] conjectured that the theoretical asymptotics
do not appear to take over for $p$ in the range of practical interest (a few
hundred bits). A second problem associated with the cubic sieve algorithm is
that it requires a solution of a certain Diophantine equation. It is not known
how to find a solution of this Diophantine equation in the general case. For
certain special primes $p$ a solution arises naturally, for example, when $p$
is close to a whole cube.

Recently, a new variant of the index calculus method based on general number
field sieves (NFS) has been proposed and has a conjectured heuristic run time
of

$$L(p,1/3,c) = \exp\left((c + o(1))(\log p)^{1/3}(\log\log p)^{2/3}\right).$$

Weber et al. [12,14,15] have implemented and proved the practicality of this
method. Currently the NFS-based methods are known to be the fastest algorithms
for solving the discrete logarithm problem over prime fields.

In this paper, we report efficient implementations of the linear sieve and the 
cubic sieve algorithms. To the best of our knowledge, ours is the first large-scale 
implementation of the cubic sieve algorithm. In our implementation, we employ 
ideas similar to those used in the quadratic sieve algorithm for integer factoriza- 
tion i W3| . Our experiments seem to reveal that the equation collecting phase 
of the cubic sieve algorithm, whenever applicable, runs faster than that in the 
linear sieve algorithm. 



^1 We denote $\log x = \log_{10} x$, $\ln x = \log_e x$ and $\lg x = \log_2 x$.

In the next two sections, we briefly describe the linear sieve and the cubic
sieve algorithms. Performance of our implementation and comparison of the two
algorithms for a randomly chosen prime field are presented in Sections 4 and 5.
Our emphasis is not to set a record on the computation of discrete logarithms,
but to point out that our heuristic principles really work in practical
situations. We, therefore, experimented with a small prime (of length around
150 bits). Even for this field we get a performance gain between two and three.
For larger prime fields, the performance improvement of the cubic sieve method
over the linear sieve method is expected to get accentuated. We conclude the
paper in Section 6.

2 The Linear Sieve Algorithm 

The first stage of the computation of discrete logarithms over a prime field
$\mathbb{F}_p$ using the currently known subexponential methods involves
calculation of discrete logarithms of elements of a given subset of
$\mathbb{F}_p$, called the factor base. To this end, a set of linear
congruences is solved modulo $p - 1$. Each such congruence is obtained by
checking the factorization of certain integers computed deterministically or
randomly. For the linear sieve algorithm, the congruences are generated in the
following way.

Let $H = \lfloor\sqrt{p}\rfloor + 1$ and $J = H^2 - p$. Then $J \le 2\sqrt{p}$.
For small integers $c_1$, $c_2$, the right side of the following congruence
(henceforth denoted as $T(c_1,c_2)$)

$$(H + c_1)(H + c_2) \equiv J + (c_1 + c_2)H + c_1c_2 \pmod{p} \qquad (1)$$

is of the order of $\sqrt{p}$. If the integer $T(c_1,c_2)$ is smooth with
respect to the first $t$ primes $q_1, q_2, \ldots, q_t$, that is, if we have a
factorization like $J + (c_1 + c_2)H + c_1c_2 = \prod_{i=1}^{t} q_i^{\alpha_i}$,
then we have a relation

$$\mathrm{ind}_g(H + c_1) + \mathrm{ind}_g(H + c_2) = \sum_{i=1}^{t} \alpha_i \,\mathrm{ind}_g(q_i). \qquad (2)$$

For the linear sieve algorithm, the factor base comprises primes less than
$L[1/2]$ (so that by the prime number theorem $t \approx L[1/2]/\ln(L[1/2])$)
and integers $H + c$ for $-M \le c \le M$. The bound $M$ on $c$ is chosen such
that $2M \approx L[1/2 + \epsilon]$ for some small positive real $\epsilon$.
Once we check the factorization of $T(c_1,c_2)$ for all values of $c_1$ and
$c_2$ in the indicated range, we are expected to get $L[1/2 + 3\epsilon]$
relations like (2) involving the unknown indices of the factor base elements.
If we further assume that the primitive element $g$ is a small prime which
itself is in the factor base, then we get a relation $\mathrm{ind}_g(g) = 1$.
The resulting system with asymptotically more equations than unknowns is
expected to be of full rank and is solved to compute the discrete logarithms of
elements in the factor base.
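The relation-collection idea can be tried out on a toy example. The sketch below is an illustration under our own assumptions, not the implementation reported in this paper: the prime 1000003, the factor base of primes below 30, the bound M = 20 and the helper names are all hypothetical, smoothness is tested by plain trial division rather than sieving, and non-positive values of T are simply skipped.

```python
from math import isqrt

def small_primes(bound: int) -> list[int]:
    """All primes below `bound` (simple sieve of Eratosthenes)."""
    flags = [True] * bound
    flags[0:2] = [False, False]
    for i in range(2, isqrt(bound) + 1):
        if flags[i]:
            for j in range(i * i, bound, i):
                flags[j] = False
    return [i for i, is_prime in enumerate(flags) if is_prime]

def factor_over(n: int, primes: list[int]):
    """Exponent vector of n over `primes`, or None if n is not smooth (or n <= 0)."""
    if n <= 0:
        return None
    exps = []
    for q in primes:
        e = 0
        while n % q == 0:
            n //= q
            e += 1
        exps.append(e)
    return exps if n == 1 else None

p = 1000003                 # toy prime, far smaller than the fields in the text
H = isqrt(p) + 1            # H = floor(sqrt(p)) + 1
J = H * H - p               # J = H^2 - p
primes = small_primes(30)   # toy factor base of small primes
M = 20

relations = []
for c1 in range(-M, M + 1):
    for c2 in range(c1, M + 1):
        T = J + (c1 + c2) * H + c1 * c2
        assert (H + c1) * (H + c2) % p == T % p   # congruence (1)
        exps = factor_over(T, primes)
        if exps is not None:
            relations.append((c1, c2, exps))
print(len(relations), "smooth values of T(c1, c2) found")
```

Each stored triple corresponds to one relation of the form (2): the exponent vector gives the coefficients of the indices of the small primes.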

In order to check for the smoothness of the integers
$T(c_1,c_2) = J + (c_1 + c_2)H + c_1c_2$ for $c_1$, $c_2$ in the range
$-M, \ldots, M$, sieving techniques are used. First one fixes a $c_1$ and
initializes to zero an array $\mathfrak{A}$ indexed $-M, \ldots, M$. One then
computes, for each prime power $q^h$ ($q$ is a small prime in the factor base
and $h$ is a small positive exponent), a solution for $c_2$ of the congruence
$(H + c_1)c_2 + (J + c_1H) \equiv 0 \pmod{q^h}$. If $\gcd(H + c_1, q) = 1$,
i.e. if $H + c_1$ is not a multiple of $q$, then the solution is given by
$d \equiv -(J + c_1H)(H + c_1)^{-1} \pmod{q^h}$. The inverse in the last
equation can be calculated by running the extended gcd algorithm on $H + c_1$
and $q^h$. Then for each value of $c_2$ ($-M \le c_2 \le M$) that is congruent
to $d \pmod{q^h}$, $\lg q$ is added^2 to the corresponding array location
$\mathfrak{A}_{c_2}$. On the other hand, if $q^{h_1} \,\|\, (H + c_1)$ with
$h_1 > 0$, we compute $h_2 \ge 0$ such that $q^{h_2} \,\|\, (J + c_1H)$. If
$h_1 > h_2$, then for each value of $c_2$, the expression $T(c_1,c_2)$ is
divisible by $q^{h_2}$ and by no higher powers of $q$. So we add the quantity
$h_2 \lg q$ to $\mathfrak{A}_{c_2}$ for all $-M \le c_2 \le M$. Finally, if
$h_1 \le h_2$, then we add $h_1 \lg q$ to $\mathfrak{A}_{c_2}$ for all
$-M \le c_2 \le M$ and for $h > h_1$ solve the congruence as

$$\left(\frac{H + c_1}{q^{h_1}}\right) c_2 + \left(\frac{J + c_1H}{q^{h_1}}\right) \equiv 0 \pmod{q^{h-h_1}}.$$

Once the above procedure is carried out for each small prime $q$ in the factor
base and for each small exponent $h$,^3 we check for which values of $c_2$ the
entry of $\mathfrak{A}$ at index $c_2$ is sufficiently close to the value
$\lg(T(c_1,c_2))$. These are precisely the values of $c_2$ such that, for the
given $c_1$, the integer $T(c_1,c_2)$ factorizes smoothly over the small primes
in the factor base.
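The computation of the sieving root $d$ in the invertible case can be sketched as follows. This is a hedged illustration on a toy prime of our own choosing; the helper name `sieve_root` is hypothetical, and Python's built-in `pow` with exponent -1 (available from Python 3.8) stands in for the extended gcd computation.

```python
from math import isqrt

p = 1000003                      # toy prime (an assumption for illustration)
H = isqrt(p) + 1
J = H * H - p

def sieve_root(c1: int, q: int, h: int):
    """Solution d of (H + c1)*c2 + (J + c1*H) == 0 (mod q**h), or None if q | H + c1."""
    modulus = q ** h
    if (H + c1) % q == 0:
        return None               # non-invertible case, handled separately in the text
    inverse = pow(H + c1, -1, modulus)   # modular inverse (Python >= 3.8)
    return (-(J + c1 * H) * inverse) % modulus

c1, q, h = -3, 3, 2
d = sieve_root(c1, q, h)
hits = [c2 for c2 in range(-20, 21) if (c2 - d) % q ** h == 0]
for c2 in hits:
    T = J + (c1 + c2) * H + c1 * c2
    assert T % q ** h == 0        # every index hit by the sieve is divisible by q^h
print(d, hits)
```

Only the array locations in `hits` would receive a contribution of $\lg q$ for this prime power, so no trial division over the whole interval is needed.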

In an actual implementation, one might choose to vary $c_1$ in the sequence
$-M, -M+1, -M+2, \ldots$ and, for each $c_1$, consider only the values of $c_2$
in the range $c_1 \le c_2 \le M$. The criterion for ‘sufficient closeness’ of
the array element $\mathfrak{A}_{c_2}$ and $\lg(T(c_1,c_2))$ goes like this. If
$T(c_1,c_2)$ factorizes smoothly over the small primes in the factor base, then
$\lg(T(c_1,c_2))$ should differ from $\mathfrak{A}_{c_2}$ by a small positive
or negative value. On the other hand, if $T(c_1,c_2)$ is not smooth, it would
have a prime factor at least as large as $q_{t+1}$, and hence the difference
between $\lg(T(c_1,c_2))$ and $\mathfrak{A}_{c_2}$ would not be much less than
$\lg q_{t+1}$. In other words, the values of the difference
$\lg(T(c_1,c_2)) - \mathfrak{A}_{c_2}$ for smooth values of $T(c_1,c_2)$ are
well separated from those for non-smooth values, and one might choose for the
criterion a check whether the absolute value of the above difference is less
than 1.

This completes the description of the equation collecting phase of the first 
stage of the linear sieve algorithm. This is followed by the solution of the linear 
system modulo p — 1. The second stage of the algorithm involves computation of 
discrete logarithms of arbitrary elements of Fp using the database of logarithms 
of factor base elements. We do not deal with these steps in this paper, but refer 
the reader to [SEE] for details. 



^2 More precisely, some approximate value of $\lg q$, say, for example, the
integer $\lfloor 1000 \lg q \rfloor$.

^3 The exponent $h$ can be chosen in the sequence 1, 2, 3, ... until one finds
an $h$ for which none of the integers between $-M$ and $M$ is congruent to $d$.

3 The Cubic Sieve Algorithm



Let us assume that we know a solution of the Diophantine equation

$$X^3 \equiv Y^2 Z \pmod{p}, \qquad X^3 \ne Y^2 Z \qquad (3)$$

with $X$, $Y$, $Z$ of the order of $p^{\alpha}$ for some
$\frac{1}{3} \le \alpha < \frac{1}{2}$. Then we have the congruence

$$(X + AY)(X + BY)(X + CY) \equiv Y^2\left[Z + (AB + AC + BC)X + (ABC)Y\right] \pmod{p} \qquad (4)$$

for all triples $(A, B, C)$ with $A + B + C = 0$. If the bracketed expression
on the right side of the above congruence, henceforth denoted as $R(A,B,C)$, is
smooth with respect to the first $t$ primes $q_1, q_2, \ldots, q_t$, that is,
if we have a factorization $R(A,B,C) = \prod_{i=1}^{t} q_i^{\beta_i}$, then we
have a relation like

$$\mathrm{ind}_g(X + AY) + \mathrm{ind}_g(X + BY) + \mathrm{ind}_g(X + CY) \equiv 2\,\mathrm{ind}_g(Y) + \sum_{i=1}^{t} \beta_i \,\mathrm{ind}_g(q_i) \pmod{p-1}. \qquad (5)$$

If $A$, $B$, $C$ are small integers, then $R(A,B,C)$ is of the order of
$p^{\alpha}$, since each of $X$, $Y$ and $Z$ is of the same order. This means
that we are now checking integers smaller than $O(p^{1/2})$ for smoothness over
the first $t$ primes. As a result, we are expected to get relations like (5)
more easily than relations like (2) in the linear sieve method.
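Congruence (4) can be checked mechanically on a toy instance. In the sketch below the prime 997 (just below the cube $10^3$) and the solution $X = 10$, $Y = 1$, $Z = 3$ are our own illustrative choices, not values used in the experiments of this paper.

```python
p = 997             # toy prime just below the cube 10**3 (illustrative assumption)
X, Y, Z = 10, 1, 3  # X**3 = 1000 = p + 3, so X^3 == Y^2 * Z (mod p)
assert pow(X, 3, p) == (Y * Y * Z) % p and X ** 3 != Y * Y * Z

def R(A: int, B: int, C: int) -> int:
    """Bracketed expression on the right side of congruence (4)."""
    return Z + (A * B + A * C + B * C) * X + (A * B * C) * Y

M = 5
for B in range(-M, M + 1):
    for C in range(-M, M + 1):
        A = -(B + C)                                  # enforce A + B + C = 0
        lhs = (X + A * Y) * (X + B * Y) * (X + C * Y)
        assert (lhs - Y * Y * R(A, B, C)) % p == 0    # congruence (4)
print(R(-1, 0, 1), R(-2, 1, 1))
```

The difference between the two sides is always the fixed multiple $X^3 - Y^2Z$ of $p$, which is why the identity holds for every triple summing to zero.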

This observation leads to the formulation of the cubic sieve algorithm as
follows. The factor base comprises primes less than $L[\sqrt{\alpha/2}]$ (so
that $t \approx L[\sqrt{\alpha/2}]/\ln(L[\sqrt{\alpha/2}])$), the integer $Y$
(or $Y^2$) and the integers $X + AY$ for $0 \le |A| \le M$, where $M$ is of the
order of $L[\sqrt{\alpha/2}]$. The integer $R(A,B,C)$ is, therefore, of the
order of $p^{\alpha} L[\sqrt{3\alpha/2}]$ and hence the probability that it is
smooth over the first $t$ primes selected as above is about
$L[-\sqrt{\alpha/2}]$. As we check the smoothness for $L[\sqrt{2\alpha}]$
triples $(A,B,C)$ (with $A + B + C = 0$), we expect to obtain
$L[\sqrt{\alpha/2}]$ relations like (5).

In order to check for the smoothness of
$R(A,B,C) = Z + (AB + AC + BC)X + (ABC)Y$ over the first $t$ primes, sieving
techniques are employed. We maintain an array $\mathfrak{A}$ indexed
$-M, \ldots, +M$ as in the linear sieve algorithm. At the beginning of each
sieving step, we fix $C$, initialize the array $\mathfrak{A}$ to zero and let
$B$ vary. The relation $A + B + C = 0$ allows us to eliminate $A$ from
$R(A,B,C)$ as $R(A,B,C) = -B(B + C)(X + CY) + (Z - C^2X)$. For a fixed $C$, we
try to solve the congruence

$$-B(B + C)(X + CY) + (Z - C^2X) \equiv 0 \pmod{q^h} \qquad (6)$$

where $q$ is a small prime in the factor base and $h$ is a small positive
exponent. This is a quadratic congruence in $B$. If $X + CY$ is invertible
modulo $q^h$ (i.e. modulo $q$), then the solution for $B$ is given by

$$B \equiv -\frac{C}{2} + \sqrt{(X + CY)^{-1}(Z - C^2X) + \frac{C^2}{4}} \pmod{q^h} \qquad (7)$$

where the square root is modulo $q^h$. If the expression inside the radical is
a quadratic residue modulo $q^h$, then for each solution $d$ of $B$ in (7),
$\lg q$ is added to those indices of $\mathfrak{A}$ which are congruent to $d$
modulo $q^h$. On the other hand, if the expression under the radical is a
quadratic non-residue modulo $q^h$, we have no solutions for $B$ in (6).
Finally, if $X + CY$ is non-invertible modulo $q$, we compute $h_1 > 0$ and
$h_2 \ge 0$ such that $q^{h_1} \,\|\, (X + CY)$ and
$q^{h_2} \,\|\, (Z - C^2X)$. If $h_1 > h_2$, then $R(A,B,C)$ is divisible by
$q^{h_2}$ and by no higher powers of $q$ for each value of $B$ (and for the
fixed $C$). We add $h_2 \lg q$ to $\mathfrak{A}_i$ for each $-M \le i \le M$.
On the other hand, if $h_1 \le h_2$, we add $h_1 \lg q$ to $\mathfrak{A}_i$ for
each $-M \le i \le M$ and try to solve the congruence

$$-B(B + C)\left(\frac{X + CY}{q^{h_1}}\right) + \left(\frac{Z - C^2X}{q^{h_1}}\right) \equiv 0 \pmod{q^{h-h_1}}$$

for $h > h_1$. Since $\frac{X + CY}{q^{h_1}}$ is invertible modulo
$q^{h-h_1}$, this congruence can be solved similarly to (7).
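For $h = 1$ and an odd prime $q$, formula (7) can be compared against a direct search for the roots of (6). The sketch below is illustrative only: the values $X = 10$, $Y = 1$, $Z = 3$ (a solution of (3) modulo the toy prime 997) and the helper names are our assumptions, and the modular square root is found by brute force, which is adequate only for tiny $q$.

```python
X, Y, Z = 10, 1, 3        # toy solution of X^3 == Y^2 * Z (mod 997); an assumption

def R_of(B: int, C: int) -> int:
    """R(A, B, C) with A = -(B + C) eliminated."""
    return -B * (B + C) * (X + C * Y) + (Z - C * C * X)

def roots_mod_q(C: int, q: int) -> list[int]:
    """Roots B mod q of congruence (6) for h = 1; q odd prime not dividing X + C*Y."""
    inv = pow(X + C * Y, -1, q)                 # (X + CY)^(-1) mod q
    half = pow(2, -1, q)                        # 1/2 mod q
    radicand = (inv * (Z - C * C * X) + C * C * half * half) % q
    square_roots = [s for s in range(q) if (s * s - radicand) % q == 0]  # brute force
    return sorted({(-C * half + s) % q for s in square_roots})

C, q = 2, 7
formula = roots_mod_q(C, q)
brute = [B for B in range(q) if R_of(B, C) % q == 0]
print(formula, brute)
```

When the radicand is a non-residue the list of square roots is empty and no array locations are updated for that prime, in agreement with the case distinction above.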

Once the above procedure is carried out for each small prime $q$ in the factor
base and for each small exponent $h$, we check for which values of $B$ the
entry of $\mathfrak{A}$ at index $B$ is sufficiently close to the value
$\lg(R(A,B,C))$. These are precisely the values of $B$ for which $R(A,B,C)$ is
smooth over the first $t$ primes for the given $C$. The criterion of
‘sufficient closeness’ of $\mathfrak{A}_B$ and $\lg(R(A,B,C))$ is the same as
described in connection with the linear sieve algorithm.

In order to avoid duplication of effort, we should examine the smoothness of
$R(A,B,C)$ for $-M \le A \le B \le C \le M$. With this condition, it can be
easily shown that $C$ varies from 0 to $M$ and, for a fixed $C$, $B$ varies
from $-C/2$ to $\min(C, M - C)$. Though we do not use the value of $A$ directly
in the sieving procedure described above, it is useful^4 to note that for a
fixed $C$, $A$ varies from $\max(-2C, -M)$ to $-C/2$. In particular, $A$ is
always negative.

After sufficient number of relations are available, the resulting system is 
solved modulo p — 1 and the discrete logarithms of the factor base elements are 
stored for computation of individual discrete logarithms. We refer the reader to 
mm for details on the solution of sparse linear systems and on the computation 
of individual discrete logarithms with the cubic sieve method. 

Attractive as it looks, the cubic sieve method has several drawbacks which 
impair its usability in practical situations. 

1. It is currently not known how to solve the congruence (3) for a general $p$.
And even when it is solvable, how large can $\alpha$ be? For practical purposes
$\alpha$ should be as close to $\frac{1}{3}$ as possible. No non-trivial
results are known to the authors that can classify primes $p$ according to the
smallest possible values of $\alpha$ they are associated with.

^4 for a reason that will be clear in Section 5

2. Because of the quadratic and cubic expressions in $A$, $B$ and $C$ as
coefficients of $X$ and $Y$ in $R(A,B,C)$, the integers $R(A,B,C)$ tend to be
as large as $p^{1/2}$ even when $\alpha$ is equal to 1/3. If we compare this
scenario with that for $T(c_1,c_2)$ (see Equation (1)), we see that the
coefficient of $H$ is a linear function of $c_1$ and $c_2$ and, as such, the
integers $T(c_1,c_2)$ are larger than $p^{1/2}$ by a small multiplicative
factor. This shows that though the integers $R(A,B,C)$ are asymptotically
smaller than the integers $T(c_1,c_2)$, the former are, in
practice, around 10^-10® times smaller than latter ones, even when a assu- 
mes the most favorable value (namely, 1/3). In other words, when one wants 
to use the cubic sieve algorithm, one should use values of t (i.e. the number 
of small primes in the factor base) much larger than the values prescribed 
by the asymptotic formula for t. 

3. The second stage of the cubic sieve algorithm, i.e. the stage that involves 
computation of individual logarithms, is asymptotically as slow as the equa- 
tion collection stage. For the linear sieve algorithm, on the other hand, indi- 
vidual logarithms can be computed much faster than the equation collecting 
phase. 

In this paper, we address this second issue related to the cubic sieve
algorithm. We report an efficient implementation of the cubic sieve algorithm
for the case $\alpha = 1/3$ that runs faster than the linear sieve method for
the same prime. Our experiments suggest that the cubic sieve algorithm, when
applicable, outperforms the linear sieve method even when the cardinality of
the ground field is only around 150 bits long.

4 An Efficient Implementation of the Linear Sieve 
Method 

Before we delve into the details of the comparison of the linear and cubic sieve 
methods, we describe an efficient implementation of the linear sieve algorithm. 
The tricks that help us speed up the equation collecting phase of the linear sieve 
method are very similar to those employed in the quadratic sieve algorithm for 
integer factorization (See [1151 13j for details). 

We first recall that at the beginning of each sieving step, we find a solution
for $c_2$ modulo $q^h$ in the congruence $T(c_1,c_2) \equiv 0 \pmod{q^h}$ for
every small prime $q$ in the factor base and for a set of small exponents $h$.
The costliest operation that needs to be carried out for each such solution is
the computation of a modular
inverse (namely, that of H+ci modulo q^). As described in 0 and as is evident 
from our experiments too, calculations of these inverses take more than half of 
the CPU time needed for the entire equation collecting stage. Any trick that 
reduces the number of computations of the inverses, speeds up the algorithm. 

One way to achieve this is to solve the congruence every time only for $h = 1$
and ignore all higher powers of $q$. That is, for every $q$ (and $c_1$), we
check which of the integers $T(c_1,c_2)$ are divisible by $q$ and then add
$\lg q$ to the corresponding indices of the array $\mathfrak{A}$. If some
$T(c_1,c_2)$ is divisible by a higher power of $q$, this






A. Das and C.E. Veni Madhavan 



strategy fails to add $\lg q$ the required number of times. As a result, this
$T(c_1,c_2)$, even if smooth, may fail to pass the ‘closeness criterion’
described in Section 2. This is, however, not a serious problem, because we may
increase the cut-off from a value smaller than $\lg q_t$ to a value
$\zeta \lg q_t$ for some $\zeta \ge 1$. This means that some non-smooth
$T(c_1,c_2)$ will pass the selection criterion in addition to some smooth ones
that could not otherwise be detected. This is reasonable, because the
non-smooth ones can later be filtered out from the smooth ones, and one might
even use trial divisions to do so. For primes $p$ of less than 200 bits,
values of C < 2.5 work quite well in practice W- 

The reason why this strategy performs well in practice is as follows. If $q$
is small, for example $q = 2$, we should add only 1 to $\mathfrak{A}_{c_2}$ for
every power of 2 dividing $T(c_1,c_2)$. On the other hand, if $q$ is much
larger, say $q = 1299709$ (the $10^5$th prime), then $\lg q \approx 20.31$ is
large. But $T(c_1,c_2)$ would not be, in general, divisible by a high power of
this $q$. The approximate calculation of the logarithm of the smooth part of
$T(c_1,c_2)$, therefore, leads to a situation where the probability that a
smooth $T(c_1,c_2)$ is actually detected as smooth is quite high. A few
relations would still be missed even with the modified ‘closeness criterion’,
but that is more than compensated for by the speed-up gained by the method.
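The approximate strategy can be sketched on a toy example as follows. This is not the implementation reported here: the prime, the factor base, the bound M, the cut-off multiplier `zeta` and all function names are illustrative assumptions, floating-point logarithms replace scaled integer ones, and non-positive values of T are skipped for simplicity.

```python
from math import isqrt, log2

p = 1000003                                    # toy prime (illustrative assumption)
H = isqrt(p) + 1
J = H * H - p
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]  # toy factor base
M, zeta = 20, 1.5                              # zeta is the cut-off multiplier

def T(c1: int, c2: int) -> int:
    return J + (c1 + c2) * H + c1 * c2

def is_smooth(n: int) -> bool:
    """Trial division over the factor base (used only to filter candidates)."""
    if n <= 0:
        return False
    for q in primes:
        while n % q == 0:
            n //= q
    return n == 1

def approximate_sieve(c1: int) -> list[int]:
    """Values c2 found by sieving with h = 1 only, then filtered by trial division."""
    logs = {c2: 0.0 for c2 in range(-M, M + 1)}
    for q in primes:
        if (H + c1) % q == 0:
            if (J + c1 * H) % q == 0:          # q divides T(c1, c2) for every c2
                for c2 in logs:
                    logs[c2] += log2(q)
            continue
        d = (-(J + c1 * H) * pow(H + c1, -1, q)) % q
        first = -M + (d + M) % q               # smallest index >= -M congruent to d
        for c2 in range(first, M + 1, q):
            logs[c2] += log2(q)
    cutoff = zeta * log2(primes[-1])
    candidates = [c2 for c2 in range(c1, M + 1)
                  if T(c1, c2) > 0 and log2(T(c1, c2)) - logs[c2] < cutoff]
    return [c2 for c2 in candidates if is_smooth(T(c1, c2))]

print(approximate_sieve(-3))
```

In this toy run the value $T(-3,4) = 29 \cdot 103$ passes the relaxed cut-off although it is not smooth, and is then discarded by the trial-division filter, which is the kind of spurious candidate discussed above.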

The above strategy helps us in a way other than by reducing the number of 
modular inverses. We note that for practical values of p, the small primes in the 
factor base are usually single-precision ones. As a result, the computation of d 
(See Section 2) can be carried out using single-precision operations only. 

Throughout the rest of this section we compare the performance of the modified
strategy with that of the original strategy for a value of $p$ of length around
150 bits. This prime is chosen as a random one satisfying the conditions
(i) $(p - 1)/2$ is also a prime, and (ii) $p$ is close to a whole cube. This
second condition is necessary, because for these primes the cubic sieve
algorithm is also applicable, so that we can compare the performance of the two
sieve algorithms for these primes. Our experiments are based on the Galois
Field Library routines developed by the authors [1] and are carried out on a
200 MHz Pentium machine running Linux version 2.0.34 and having 64 MB of RAM.
The GNU C Compiler version 2.7 is used.



Table 1. Performance of the linear sieve algorithm

p = 1320245474656309183513988729373583242842871683
t = 7000, M = 30000

Algorithm     $\zeta$   No. of             No. of             $\rho/\nu$   CPU Time
                        Relations ($\rho$) Variables ($\nu$)               (seconds)
Exact         0.1       108637             67001              1.6214       225590
Approximate   1.0       108215             67001              1.6151       101712
              1.5       108624             67001              1.6212       101818
              2.0       108636             67001              1.6214       102253
              2.5       108637             67001              1.6214       102250

In Table 1 we compare the performance of the ‘exact’ version of the algorithm
(where all relations are made available by choosing values of $h \ge 1$) with
that of the ‘approximate’ version of the algorithm (in which powers $h > 1$ are
neglected). The CPU times listed in the table do not include the time for
filtering out the ‘spurious’ relations obtained in the approximate version. It
is evident from the table that the performance gain obtained using the
heuristic variant is more than 2. It is also clear that values of $\zeta$
between 1.5 and 2 suffice for fields of this size.



5 An Efficient Implementation of the Cubic Sieve 
Method 



For the cubic sieve method, we employ strategies similar to those described in
the last section. That is, we solve the congruence
$R(A,B,C) \equiv 0 \pmod{q}$ for each small prime $q$ in the factor base and
ignore higher powers of $q$ that might divide $R(A,B,C)$. As before, we set the
cut-off at $\zeta \lg q_t$ for some $\zeta \ge 1$. We are not going to
elaborate the details of this strategy and the expected benefits once again in
this section. We concentrate on an additional heuristic modification of the
equation collecting phase instead.

We recall from Section 3 that we check the smoothness of $R(A,B,C)$ for
$-M \le A \le B \le C \le M$. With this condition, $C$ varies from 0 to $M$. We
note that for each value of $C$, we have to execute the entire sieving
operation once. For each such sieving operation (that is, for a fixed $C$), the
sieving interval for $B$ is (i.e. the admissible values of $B$ are)
$-C/2 \le B \le \min(C, M - C)$. Correspondingly, $A = -(B + C)$ can vary from
$\max(-2C, -M)$ to $-C/2$. It is easy to see that in this case the total number
of triples $(A,B,C)$ for which the smoothness of $R(A,B,C)$ is examined is

$$\tau = \sum_{C=0}^{M} \left[1 + \lfloor C/2 \rfloor + \min(C, M - C)\right] \approx M^2/2.$$

The number of unknowns, that is, the size of the factor base, on the other
hand, is $\nu \approx 2M + t$.
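The count of triples can be confirmed by direct enumeration for small $M$. The sketch below is only a sanity check written for illustration; the function names are ours.

```python
def count_formula(M: int) -> int:
    """tau = sum over C of 1 + floor(C/2) + min(C, M - C)."""
    return sum(1 + C // 2 + min(C, M - C) for C in range(M + 1))

def count_brute(M: int) -> int:
    """Directly enumerate triples with -M <= A <= B <= C <= M and A + B + C = 0."""
    total = 0
    for C in range(-M, M + 1):
        for B in range(-M, C + 1):
            A = -(B + C)
            if -M <= A <= B:
                total += 1
    return total

for M in (10, 100, 1000):
    print(M, count_formula(M), count_formula(M) / (M * M / 2))
```

The printed ratio approaches 1 as $M$ grows, in line with the estimate $\tau \approx M^2/2$.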

If we remove the restriction $A \ge -M$ and allow $A$ to be as negative as
$-\lambda M$ for some $1 < \lambda \le 2$, then we benefit in the following
way. As before, we allow $C$ to vary from 0 to $M$, keeping the number of
sieving operations fixed. Since $A$ can now assume values smaller than $-M$,
the sieving interval increases to $-C/2 \le B \le \min(C, \lambda M - C)$. As a
result, the total number of triples $(A,B,C)$ becomes

$$\tau_\lambda = \sum_{C=0}^{M} \left[1 + \lfloor C/2 \rfloor + \min(C, \lambda M - C)\right] \approx \frac{M^2}{4}\left(4\lambda - \lambda^2 - 1\right),$$

whereas the size of the factor base increases to
$\nu_\lambda \approx (\lambda + 1)M + t$. (Note that with this notation the
value $\lambda = 1$ corresponds to the original algorithm and $\tau = \tau_1$
and $\nu = \nu_1$.) The ratio $\tau_\lambda/\nu_\lambda$ is approximately
proportional to the number of smooth integers $R(A,B,C)$ generated by the
algorithm divided by the number of unknowns. Therefore, $\lambda$ should be set
at a value for which this ratio is maximum. If one treats $t$ and $M$ as
constants, then the maximum is attained at
$\lambda^* = -U + \sqrt{U^2 + 4U + 1}$, where
$U = \frac{M + t}{M} = 1 + \frac{t}{M}$. As we increase $U$ from 1 to $\infty$
(or, equivalently, the ratio $t/M$ from 0 to $\infty$), the value of
$\lambda^*$ increases monotonically from $\sqrt{6} - 1 \approx 1.4495$ to 2. In
the following table (Table 2), we summarize the variation of
$\tau_\lambda/\nu_\lambda$ for some values of $U$. These values of $U$
correspond from left to right to $t \ll M$, $t \approx M/2$, $t \approx M$ and
$t \approx 2M$ respectively. The corresponding values of $\lambda^*$ are
respectively 1.4495, 1.5414, 1.6056 and 1.6904. It is clear from the table that,
for practical ranges of values of $U$, the choice $\lambda = 1.5$ gives
performance quite close to the optimal.
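The optimal value of $\lambda$ and the approximate ratios follow directly from these formulas. The sketch below evaluates them using $\nu_\lambda \approx M(\lambda + U)$, so the ratio is expressed in units of $M$; the function names `ratio` and `lam_star` are ours and the code is offered only as a numerical check.

```python
from math import sqrt

def ratio(lam: float, U: float) -> float:
    """Approximate tau_lambda / nu_lambda in units of M."""
    return (4 * lam - lam * lam - 1) / (4 * (lam + U))

def lam_star(U: float) -> float:
    """Maximiser of ratio(., U): lambda* = -U + sqrt(U^2 + 4U + 1)."""
    return -U + sqrt(U * U + 4 * U + 1)

for U in (1, 1.5, 2, 3):
    best = lam_star(U)
    print(f"U={U}: lambda*={best:.4f}, ratio at 1.5 = {ratio(1.5, U):.4f} M, "
          f"ratio at lambda* = {ratio(best, U):.4f} M")
```

Setting the derivative of the ratio to zero gives $\lambda^2 + 2U\lambda - (4U + 1) = 0$, whose positive root is the expression used in `lam_star`.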



Table 2. Variation of $\tau_\lambda/\nu_\lambda$ with $\lambda$

                 $\tau_\lambda/\nu_\lambda$ (approx)
$\lambda$      U = 1       U = 1.5     U = 2       U = 3
1              0.2500 M    0.2000 M    0.1667 M    0.1250 M
1.5            0.2750 M    0.2292 M    0.1964 M    0.1527 M
2              0.2500 M    0.2143 M    0.1875 M    0.1500 M
$\lambda^*$    0.2753 M    0.2293 M    0.1972 M    0.1548 M


We note that this scheme keeps $M$ and the range of variation of $C$ constant
and hence does not increase the number of sieving steps and, in particular, the
number of modular inverses and square roots. It is, therefore, advisable to
apply the trick (with, say, $\lambda = 1.5$) instead of increasing $M$. With
that one is expected to get a speed-up of about 10 to 20% and obtain a larger
database.

In what follows, we report on the performance of the cubic sieve algorithm for
various values of the parameters $\zeta$ and $\lambda$. We also compare the
performance of the cubic sieve algorithm with that of the linear sieve
algorithm. We work in the prime field $\mathbb{F}_p$ with

p = 1320245474656309183513988729373583242842871683

as in the last section. For this prime, we have

$X = \lfloor \sqrt[3]{p} \rfloor + 1 = 1097029305312372$, $Y = 1$, $Z = 31165$

as a solution of (3).
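The stated solution can be checked with a few lines of arbitrary-precision integer arithmetic. The sketch below uses only the values quoted above and is offered as a reader's check, not as part of the reported implementation.

```python
p = 1320245474656309183513988729373583242842871683
X, Y, Z = 1097029305312372, 1, 31165

print(pow(X, 3, p) == (Y * Y * Z) % p)   # congruence (3)
print(X ** 3 != Y * Y * Z)               # the solution is non-trivial
print(p.bit_length())                    # size of the field in bits
print((X - 1) ** 3 < p < X ** 3)         # X = floor(cube root of p) + 1
```

Because $Y = 1$ and $X^3$ exceeds $p$ only by the small amount $Z$, all three of $X$, $Y$, $Z$ are bounded by roughly $p^{1/3}$, which is the favorable case $\alpha = 1/3$.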

To start with, we fix A = 1.5 and examine the variation of the performance
of the equation collecting stage with C. We did not implement the ‘exact’ version
of this algorithm in which one tries to solve (6) for exponents h > 1 of q. Table 3
lists the experimental details for the ‘approximate’ algorithm. As in Table 1,
the CPU times do not include the time for filtering out the spurious relations
available by the more generous closeness criterion for the approximate algorithm.
For the cubic sieve method, values of C around 1.5 work quite well for our
prime p.

In Table 4, we fix C at 1.5 and tabulate the variation of the performance of the
cubic sieve algorithm for some values of A. It is clear from the table that among
the cases observed, the largest value of the ratio ρ/ν is obtained at A = 1.5.








Table 3. Performance of the cubic sieve algorithm for various values of C
p = 1320245474656309183513988729373583242842871683
t = 10000, M = 10000, A = 1.5

C      No. of          No. of          ρ/ν      CPU Time
       Relations (ρ)   Variables (ν)            (seconds)
1.0    54805           35001           1.5658   43508
1.5    54865           35001           1.5675   43336
2.0    54868           35001           1.5676   43492



(The theoretical maximum is attained at A ≈ 1.6.) We also note that changing
the value of A incurs variation in the running time by at most 1%. Thus our 
heuristic allows us to build a larger database at approximately no extra cost. 



Table 4. Performance of the cubic sieve algorithm for various values of A
p = 1320245474656309183513988729373583242842871683
t = 10000, M = 10000, C = 1.5

A      No. of          No. of          ρ/ν      CPU Time
       Relations (ρ)   Variables (ν)            (seconds)
1.0    43434           30001           1.4478   43047
1.5    54865           35001           1.5675   43336
1.6    56147           36001           1.5596   43347
2.0    58234           40001           1.4558   43499
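
As a quick sanity check on Table 4, the ratio column can be recomputed from the relation and variable counts. The snippet below is only a sketch; the dictionary layout and the names `table4`, `ratios` and `best_a` are illustrative and not taken from the paper.

```python
# Table 4 data: A -> (number of relations, number of variables)
table4 = {
    1.0: (43434, 30001),
    1.5: (54865, 35001),
    1.6: (56147, 36001),
    2.0: (58234, 40001),
}

# Ratio of relations to variables for each tabulated value of A.
ratios = {a: rel / var for a, (rel, var) in table4.items()}

# Tabulated value of A with the largest ratio.
best_a = max(ratios, key=ratios.get)

if __name__ == "__main__":
    for a in sorted(ratios):
        print(a, round(ratios[a], 4))
    print("best A among the cases observed:", best_a)
```

Among the four tabulated values the maximum is at A = 1.5, in line with the discussion of the table.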



5.1 Performance Comparison With Linear Sieve 

The speed-up obtained by the cubic sieve method over the linear sieve method
is about 2.5 for the field of size around 150 bits. For larger fields, this speed-up
is expected to be larger. It is, therefore, evident that the cubic sieve algorithm,
at least for the case α = 1/3, runs faster than the linear sieve counterpart for
the practical range of sizes of prime fields.

6 Conclusion 

In this paper, we have described various practical aspects for efficient imple- 
mentation of the linear and the cubic sieve algorithms for the computation of 
discrete logarithms over finite fields. We have also compared the performances 
of these two algorithms and established the superiority of the latter method over 
the former for the cases when p is close to a whole cube. It, however, remains 
unsettled whether the cubic sieve algorithm performs equally well for a general 
prime p. More importantly, the applicability of the cubic sieve algorithm banks
on the availability of a ‘favorable’ solution of a certain Diophantine equation.
Finding an algorithm for computing the solution of this Diophantine equation, or
even for certifying whether a solution exists, remains an open problem and
stands in the way of the general acceptance of the cubic sieve algorithm. Last
but not least, we need a performance comparison of the cubic sieve method
with the number field sieve method.
Abstract. We present external memory algorithms for outerplanarity
testing, embedding outerplanar graphs, breadth-first search (BFS) and
depth-first search (DFS) in outerplanar graphs, and finding a 2/3-separator
of size 2 for a given outerplanar graph. Our algorithms take O(sort(N))
I/Os and can easily be improved to take O(perm(N)) I/Os, as all these
problems have linear time solutions in internal memory. For BFS, DFS,
and outerplanar embedding we show matching lower bounds.



1 Introduction 

Motivation. Outerplanar graphs are a well-studied class of graphs (e.g., see 
mm)- This class is restricted enough to admit more efficient algorithms than 
those for general graphs and general enough to have practical applications, e.g. 

0 . 

Our motivation to study outerplanar graphs in external memory is three- 
fold. Firstly, efficient algorithms for triangulating and separating planar graphs 
were presented in [SE2]. The major drawback of their separator algorithm is 
that it requires an embedding and a BFS-tree of the given graph as part of the 
input. Embedding planar graphs and BFS in general graphs are hard problems 
in external memory.¹ Our goal is to show that these problems are considerably
easier for outerplanar graphs. Secondly, outerplanar graphs can be seen as com- 
binatorial representations of triangulated simple polygons and their subgraphs. 
Thirdly, every outerplanar graph is a planar graph. Thus, any lower bound that 
we can show for outerplanar graphs also holds for planar graphs. 

Model of computation. When the data set to be handled becomes too large to fit 
into the main memory of the computer, the transfer of data between fast inter- 
nal memory and slow external memory (disks) becomes a significant bottleneck. 
Existing internal memory algorithms usually access their data in a random fas- 
hion, thereby causing significantly more I/O operations than necessary. Our goal 
in this paper is to minimize the number of I/O operations performed. Several 
computational models for estimating the I/O-efficiency of algorithms have been

* Research supported by NSERC and NCE GEOIDE. 

¹ No external memory algorithm for embedding planar graphs is known. For BFS in
general graphs, the best known algorithm takes O(|V| + (|E|/|V|) sort(|V|)) I/Os [^.
developed. We adopt the parallel disk model (PDM) [TT] as our model of computation
for this paper due to its simplicity, and the fact that we consider only
a single processor. In the PDM, an external memory, consisting of D disks, is 
attached to a machine with memory size M data items. Each of the disks is 
divided into blocks of B consecutive data items. Up to D blocks, at most one 
per disk, can be transferred between internal and external memory in a single 
I/O operation. The complexity of an algorithm is the number of I/O operations 
it performs. 

Previous work. For planar graphs, a linear-time algorithm for finding a
2/3-separator of size O(√N) was presented in [7]. It is well known that every
outerplanar graph has a 2/3-separator of size 2 and that such a separator can be
computed in linear time. Outerplanarity testing and embedding outerplanar
graphs take linear time. There are simple linear time algorithms for BFS and
DFS in general graphs (see m . Refer to [4, 6] for a good exposition of outerplanar
graphs.

In the PDM, sorting, permuting, and scanning an array of size N take
sort(N) = Θ((N/(DB)) log_{M/B}(N/B)), perm(N) = Θ(min{N, sort(N)}), and
scan(N) = Θ(N/(DB)) I/Os mm- For a comprehensive survey of external memory
algorithms, refer to m- The best known BFS-algorithm for general graphs takes
O((|E|/|V|) sort(|V|) + |V|) I/Os [H]. In [T], O(sort(N)) algorithms for computing an
open ear decomposition and the connected and biconnected components of a
given graph G = (V, E) with |E| = O(|V|) were presented. They also develop a
technique, called time-forward processing, that can be used to evaluate a directed
acyclic graph of size N, viewed as a (logical) circuit, in O(sort(N)) I/Os. They
apply this technique to develop an O(sort(N)) algorithm for list-ranking. An
external memory separator algorithm for planar graphs has been presented in
[5, 12]: it takes O(sort(N)) I/Os provided that a BFS-tree and an embedding of
the graph are given. We do not know of any other results for computing separators
in external memory efficiently. Also, no efficient external memory algorithms for
embedding planar or outerplanar graphs in the plane are known [H
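
To get a feel for how these bounds compare, the following sketch evaluates them for concrete parameters. It assumes the standard PDM forms sort(N) = Θ((N/(DB)) log_{M/B}(N/B)) and scan(N) = Θ(N/(DB)), drops all constant factors, and uses hypothetical function names; it is an illustration, not a cost model.

```python
import math

def scan_ios(n, b, d=1):
    """scan(N) ~ N / (D*B), constants dropped."""
    return math.ceil(n / (d * b))

def sort_ios(n, m, b, d=1):
    """sort(N) ~ (N / (D*B)) * log_{M/B}(N/B), with at least one pass."""
    passes = max(1.0, math.log(n / b, m / b))
    return math.ceil(n / (d * b) * passes)

def perm_ios(n, m, b, d=1):
    """perm(N) ~ min{N, sort(N)}, as stated in the text."""
    return min(n, sort_ios(n, m, b, d))

if __name__ == "__main__":
    n, m, b = 10**9, 10**6, 10**3
    print(scan_ios(n, b), sort_ios(n, m, b), perm_ios(n, m, b))
```

For realistic block sizes sort(N) is far below N, so perm(N) and sort(N) coincide; the two differ only when B is very small.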

Our results. In this paper, we show the following theorem. 

Theorem 1. It takes O(perm(N)) I/Os to decide whether a given graph G with
N vertices is outerplanar. If G is outerplanar, breadth-first search, depth-first
search, and computing an outerplanar embedding of G take O(perm(N)) I/Os.
Computing a 2/3-separator of size 2 for G takes O(perm(N)) I/Os.

Preliminaries. A graph G = (V, E) is a pair of sets, V and E. V is the vertex
set of G. E is the edge set of G and consists of unordered pairs {v, w}, where

² One can use the PRAM simulation technique of [1] together with known PRAM
results. Unfortunately, the PRAM simulation introduces O(sort(N)) I/Os for every
PRAM step, and so the resulting I/O complexity is not attractive for our purposes.
v, w ∈ V. In this paper, N = |V|. A path in G is a sequence, P = (v_0, ..., v_k), of
vertices such that {v_{i-1}, v_i} ∈ E, for 1 ≤ i ≤ k. A graph is connected if there is a
path between any pair of vertices in G. A graph is biconnected if for every vertex
v ∈ V, G - v is connected. A tree with N vertices is a connected graph with
N - 1 edges. A subgraph of G is a graph G' = (V', E') with V' ⊆ V and E' ⊆ E.
The connected components of G are the maximal connected subgraphs of G. The
biconnected components (bicomps) of G are the maximal biconnected subgraphs
of G. A graph is planar if it can be drawn in the plane so that no two edges
intersect except at their endpoints. This defines an order of the edges incident
to every vertex v of G counterclockwise around v. We call G embedded if we are
given these orders for all vertices of G. R² \ (V ∪ E) is a set of connected regions.
Call these regions the faces of G. Denote the set of faces of G by F. A graph
is outerplanar if it can be drawn in the plane so that there is a face that has
all vertices of G on its boundary. For every outerplanar graph, |E| ≤ 2|V| - 3.
The dual of an embedded planar graph G = (V, E) with face set F is a graph
G* = (V*, E*), where V* = F and E* contains an edge between two vertices in
V* if the two corresponding faces in G share an edge.

An ear decomposition of a biconnected graph G is a decomposition of G into
edge-disjoint paths P_0, ..., P_k, P_i = (v_i, ..., w_i), such that G = ∪_{j=0}^{k} P_j,
P_0 = (v_0, w_0), and for every P_i, i ≥ 1, P_i ∩ G_{i-1} = {v_i, w_i}, where
G_{i-1} = ∪_{j=0}^{i-1} P_j. The paths P_i are called ears. An open ear decomposition
is an ear decomposition such that for every ear P_i, v_i ≠ w_i.

Let w : V → R⁺ be an assignment of weights to the vertices of G such that
Σ_{v ∈ V} w(v) ≤ 1. The weight of a subgraph of G is the sum of the vertex weights
in the subgraph. A 2/3-separator of G is a set S such that none of the connected
components of G - S has weight exceeding 2/3.

We will use the following characterization of outerplanar graphs [4]. 

Theorem 2. A graph G is outerplanar if and only if it does not contain a
subgraph that is an edge expansion of K_{2,3} or K_4.

In our algorithms we represent a graph G as the two sets V and E. The
embedding of a graph is represented as labels n_v(e) and n_w(e) for every edge
e = {v, w}, where n_v(e) and n_w(e) are the positions of e in the counterclockwise
orders of the edges around v and w, respectively.

2 Embedding and Outerplanarity Testing 

We show how to compute a combinatorial embedding (i.e., edge labels n_v(e)
and n_w(e)) of a given outerplanar graph G. Our algorithm for outerplanarity
testing is based on the embedding algorithm. We restrict ourselves to
biconnected outerplanar graphs. If the given graph is not biconnected, we compute the
biconnected components in O(sort(N)) I/Os PQ. Then we compute embeddings
of the biconnected components and join them at the cutpoints of the graph.

Our algorithm for embedding biconnected outerplanar graphs G consists of
two steps. In Step 1 we compute the cycle C in G that represents G's outer
boundary in the embedding. In Step 2 we embed the remaining edges, which
are diagonals of C. Step 1 relies on Observation 1. Let ℰ = (P_0, ..., P_k) be the
given open ear decomposition. We call an ear P_i, i ≥ 1, trivial if it consists of a
single edge. Otherwise we call it non-trivial. Ear P_0 is non-trivial by definition.
Let G_i = ∪_{j=0}^{i} P_j. Given an embedding of G_i, we call two vertices, v and w, of
G_i consecutive if there is an edge {v, w} in G_i that is on the outer boundary of
G_i.

Observation 1. Given a decomposition of a biconnected outerplanar graph G
into open ears P_0, ..., P_k and an embedding of G, either P_i is trivial or
the endpoints of P_i are consecutive in G_{i-1}, for 1 ≤ i ≤ k.

Observation 1 implies that, unless a non-trivial ear P_i is attached to the
endpoints of P_0, there is exactly one non-trivial ear P_j, j < i, that contains
both endpoints of P_i. We represent this relationship between non-trivial ears in
the ear tree T_E of G (see Fig. 1). This tree contains a node α_k for every
non-trivial ear P_k. Node α_j is the parent of node α_i (α_j = p(α_i)) if P_j contains both
endpoints of P_i. If P_i's endpoints are also P_0's endpoints, α_i is the child of α_0.
The vertices in P_i must appear in the same order along the outer boundary of
G as they appear in P_i. Using these observations, we construct the order of the
vertices along C using a depth-first traversal of T_E as follows:

Start the traversal of T_E at node α_0. At a node α_i, we traverse the ear P_i
and append P_i's vertices to C. When we reach an endpoint of an ear P_j with
α_i = p(α_j), we recursively traverse the subtree rooted at α_j. When we are done
with the traversal of that subtree, we continue traversing P_i.

To construct the final embedding, we use the following observation: Let
e_1, ..., e_d be the edges incident to a vertex v, sorted clockwise around v. Let
u_i be the other endpoint of edge e_i, for 1 ≤ i ≤ d. Then the vertices u_1, ..., u_d
appear in clockwise order along the outer boundary of G (see Fig. 1).

Lemma 1. An outerplanar embedding of a given outerplanar graph G with N
vertices can be computed in O(sort(N)) I/Os.
Proof sketch. Open ear decomposition: The open ear decomposition of G can be
computed in O(sort(N)) I/Os [T]. We scan the list, ℰ, of ears to construct the
list, ℰ', of non-trivial ears.

Ear tree construction: Let v be a vertex that is interior to ear P_i. (Note that
every vertex, except for the two endpoints of P_0, is interior to exactly one ear.)
Then we define v's ear number as e(v) = i. For the two endpoints, v_0 and w_0,
of P_0 we define e(v_0) = e(w_0) = 0. Let α_i = p(α_j) in T_E. It can be shown that
i = max{e(v_j), e(w_j)}. We scan the list of non-trivial ears to compute all ear
numbers. The ear tree can then be constructed in O(sort(N)) I/Os.
Construction of C: Consider the ears corresponding to the nodes stored in a
subtree T_E(α_j) of T_E rooted at a node α_j. The internal vertices of these ears
are exactly the vertices that appear between the two endpoints, v_j and w_j, of
P_j on the outer boundary of G, i.e., in C. The number of these vertices can be
computed as follows: We assign the number of internal vertices of ear P_j as a
weight w(α_j) to every node α_j and use time-forward processing to compute the
subtree weights w(T_E(α_j)) of all subtrees T_E(α_j).

Now we sort the ears P_i in ℰ' by their indices i. Scanning this sorted list
of ears corresponds to processing T_E from the root to the leaves. We use
time-forward processing to send small pieces of additional information down the tree.
We start at node α_0 and assign n(v_0) = 0 and n(w_0) = w(T_E(α_1)) + 1, where
n(v) is v's position in a clockwise traversal of C. For every subsequent ear, P_j,
we maintain the invariant that when we start the scan of P_j, we have already
computed n(v_j) and n(w_j). Then we scan along P_j and number the vertices of
P_j in their order of appearance. Let x be the current vertex in this scan and y
be the previous vertex. If there is no ear attached to x and y, n(x) = n(y) + 1.
Otherwise, let ear P_i be attached to x and y (i.e., {x, y} = {v_i, w_i}). Then
n(x) = n(y) + w(T_E(α_i)) + 1. We send n(v_i) and n(w_i) to α_i so that this
information is available when we process P_i. Note that the current ear P_j might
be stored in reverse order (i.e., n(w_j) < n(v_j)). This can easily be handled using
a stack to reverse P_j before scanning.

Computing the final embedding: First, we relabel the vertices of G in their
order around C. Then we replace every edge e = {v, w} by two directed edges
(v, w) and (w, v). We sort the list of these edges lexicographically and scan it to
compute the desired labels n_v(e) and n_w(e). □

Note that ear P_0 requires some special treatment because it is the only ear
that can have two non-trivial ears attached on two different sides. However, the
details are fairly straightforward and therefore omitted. The above algorithm
can be augmented to do outerplanarity testing in O(sort(N)) I/Os. Due to space
constraints, we refer the reader to the full version of the paper.

3 Breadth-First and Depth-First Search 

We can restrict ourselves to BFS and DFS in biconnected outerplanar graphs. If
the graph is not biconnected, we compute its connected and biconnected
components in O(sort(N)) I/Os [T]. Then we represent every connected component
by a rooted tree describing the relationship between its bicomps and cutpoints.
We process each such tree from the root to the leaves, applying Lemmas 3 and
2 to the bicomps of the graph, in order to compute a BFS-tree or DFS-tree for
every connected component.

Given a biconnected outerplanar graph G, we first embed it. The list C
computed by our embedding algorithm contains the vertices of G sorted
counterclockwise along the outer boundary of G. Given a source vertex r, a path
along the outer boundary of G, starting at r, is a DFS-tree of G. Thus, we can
compute a DFS-tree of G by scanning C.

Lemma 2. Given a biconnected outerplanar graph G with N vertices,
depth-first search in G takes O(sort(N)) I/Os.
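
The scan behind Lemma 2 is simple enough to write down directly. The following is a minimal internal-memory sketch, assuming the boundary list C is already available as a Python list; the function name and the parent-pointer representation are illustrative choices, not part of the paper.

```python
def dfs_tree_from_boundary(boundary, source):
    """Return parent pointers of the DFS-tree that follows the outer boundary.

    boundary: vertices of a biconnected outerplanar graph in their order
              along the outer boundary (the list C computed by the embedding).
    source:   the source vertex r, which must occur in boundary.
    """
    start = boundary.index(source)
    # Rotate the cyclic order so that the scan starts at the source.
    order = boundary[start:] + boundary[:start]
    parent = {source: None}
    # Each vertex is discovered from its predecessor on the boundary path.
    for previous, current in zip(order, order[1:]):
        parent[current] = previous
    return parent

if __name__ == "__main__":
    print(dfs_tree_from_boundary([3, 0, 4, 1, 2], 4))
```

The resulting tree is a Hamiltonian path starting at the source, so every non-tree edge joins a vertex to one of its ancestors, which is the defining property of a DFS-tree.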

The construction of a BFS-tree is based on the following observation: Let the
vertices of G be numbered counterclockwise around the outer boundary of G
such that the source, r, has number 1. Let v_1 < ... < v_k be the neighbours of r.
The removal of these vertices partitions G into subgraphs G_i, 1 ≤ i < k, induced
by vertices v_i + 1, ..., v_{i+1} - 1. Indeed, if there was an edge {v, w} between two
such graphs G_i and G_j, i < j, G would contain a K_4 consisting of the paths
between r, v, w, v_i, v_j, and v_{j+1}. We consider graphs G_i that are induced by
vertices v_i, ..., v_{i+1}. Let e and e' be two edges in such a graph G_i such that
the left endpoint of e is to the left of the left endpoint of e'. Then either e is
completely to the left of e' or e spans e'.

The vertices v_i and v_{i+1} are at distance 1 from the root r, and the shortest
path from any vertex in G_i to r must contain v_i or v_{i+1}. Thus, we can do BFS
in G_i by finding the shortest path from every vertex in G_i to either v_i or
v_{i+1}, whichever is shorter. We build an edge tree T_e for G_i (see Fig. 2). T_e contains a
node v(e) for every edge e in G_i. Node v(e) is the parent of another node v(e')
if edge e spans edge e' and there is no edge e'' that spans e' and is spanned by
e. T_e has an additional root node ρ, which is the parent of all nodes v(e), where
edge e is not spanned by any other edge. The level of an edge e in G_i is the
distance of node v(e) from the root ρ of T_e.

We call an edge e = {v, w} incident to vertex w a left (resp. right) edge if
v < w (resp. v > w). The minimal left (resp. right) edge for w has the minimal
level in T_e among all left (resp. right) edges incident to w. The shortest path
from w to v_i (v_{i+1}) must contain either the minimal left edge, e = {v, w}, or
the minimal right edge, e' = {v', w}, incident to w (see Fig. 2). Indeed, the
shortest path must contain v or v', and e (resp. e') is the shortest path from
w to v (resp. v'). Thus, we can process T_e level by level from the root towards
the leaves maintaining the following invariant: When we visit a node v(e) in T_e,
where e = {v, w}, we know the distances d(r, v) and d(r, w) from v and w to
the source vertex r. Assume that v(e)'s children are sorted from left to right.
We scan the list of the corresponding edges from left to right to compute the
distances of their endpoints to v. In a right-to-left scan we compute the distances
to w. Then it takes another scan to determine for every edge endpoint x, the
distance d(r, x) = min{d(r, v) + d(v, x), d(r, w) + d(w, x)} and to set the parent
pointers in the BFS-tree properly. This computation ensures the invariant for
v(e)'s children.

Lemma 3. Given a biconnected outerplanar graph G with N vertices,
breadth-first search in G takes O(sort(N)) I/Os.

Proof sketch. Constructing G_1, ..., G_{k-1}: We use our embedding algorithm to
embed G and compute the order of the vertices of G along the outer boundary
of G. Given this ordering, we use sorting and scanning to split G into subgraphs
G_i.

Constructing T_e: We sort the list, V_i, of vertices of G_i from left to right and
store with every vertex w, the list, E(w), of edges incident to w, sorted clockwise
around w. Then we scan V_i and apply a simple stack algorithm to construct T_e:
We initialize the stack by pushing ρ on the stack. When visiting vertex w, we
first pop the nodes v(e) for all left edges e = {v, w} in E(w) from the stack and
make the next node on the stack the parent of the currently popped node. Then
we push the nodes v(e) for all right edges in E(w) on the stack, in their order
of appearance.

Breadth-first search in G_i: We sort the edges of G_i by increasing level and from
left to right and use time-forward processing to send the computed distances
downward in T_e. Scanning the list of children of the currently visited node v(e)
takes O(scan(N)) I/Os for all nodes of T_e because the edges of every level are
sorted from left to right, and the right-to-left scans can be implemented using a
stack. □
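
The stack algorithm for the edge tree can be sketched in internal memory as follows. The sketch assumes that the vertices are numbered 0, 1, ... from left to right and that no two edges cross; the explicit sort orders (left edges popped nearest endpoint first, right edges pushed farthest endpoint first) are one way to realize the "order of appearance" mentioned in the proof, and all names are hypothetical.

```python
def build_edge_tree(num_vertices, edges):
    """Return (parent, level) for the edge tree of a set of non-crossing edges.

    Edges are pairs of vertex numbers; the artificial root is the string "root".
    """
    root = "root"
    left = [[] for _ in range(num_vertices)]    # edges whose right endpoint is w
    right = [[] for _ in range(num_vertices)]   # edges whose left endpoint is w
    for a, b in edges:
        a, b = min(a, b), max(a, b)
        right[a].append((a, b))
        left[b].append((a, b))

    parent = {}
    stack = [root]
    for w in range(num_vertices):
        # Pop the left edges of w, shortest first; the node below is the parent.
        for edge in sorted(left[w], reverse=True):
            stack.pop()
            parent[edge] = stack[-1]
        # Push the right edges of w, longest first, so shorter edges end on top.
        for edge in sorted(right[w], reverse=True):
            stack.append(edge)

    def depth(node):
        return 0 if node == root else 1 + depth(parent[node])

    level = {edge: depth(edge) for edge in parent}
    return parent, level

if __name__ == "__main__":
    demo = [(0, 4), (0, 2), (2, 4), (0, 1), (1, 2), (2, 3), (3, 4)]
    print(build_edge_tree(5, demo))
```

Because the edges do not cross, the edges ending at w are always the topmost entries of the stack when w is visited, which is what makes a single left-to-right scan sufficient.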

4 Separating Outerplanar Graphs 

We assume that G is connected and no vertex in G has weight greater than 1/3. If
there is a vertex v with weight w(v) > 1/3, S = {v} is trivially a 2/3-separator of G.
If G is disconnected, we compute G's connected components in O(sort(N)) I/Os
IPP. If there is no component of weight greater than 2/3, S = ∅ is a 2/3-separator
of G. Otherwise, we compute a separator of the connected component of weight
greater than 2/3.

Our strategy for finding a size-2 separator of G is as follows: First we embed
G and make G biconnected by adding appropriate edges to the outer face of
G. Then we triangulate the interior faces of the resulting graph and compute
the dual tree T* corresponding to the interior faces of the triangulation G_Δ of
G. Every edge in T* corresponds to a diagonal of the triangulation, whose two
endpoints are a size-2 separator of G_Δ and thus of G. We assign appropriate
weights to the vertices of T* that allow us to find a tree edge that corresponds
to a 2/3-separator of G_Δ.

We use the following observations: (1) G is biconnected if and only if its 
outer face is simple. (A face is simple, if every vertex appears at most once in 
a counterclockwise traversal of its boundary.) (2) If G is a cycle, we can choose 
one of the two faces of G as the outer face. Otherwise, the outer face of G is the 
only face that has all vertices of G on its boundary. 

The triangulation algorithm for planar graphs first makes all faces of the
given graph simple and then triangulates the resulting simple faces. Using the
two observations just made, this triangulation algorithm can easily be modified
to triangulate all faces of G except the outer face, which is only made simple.

Lemma 4. A size-2 2/3-separator of a given outerplanar graph G with N vertices
can be computed in O(sort(N)) I/Os.



Proof sketch. We have to show how to construct the dual tree T* corresponding
to the interior faces of G_Δ and how to find an edge of T* corresponding to a
2/3-separator of G_Δ.

Constructing the dual tree: We first number the vertices 0 through N - 1
clockwise around G_Δ. With every vertex v we store its adjacency list A(v) sorted
counterclockwise around v, where (v - 1) mod N is the first vertex in A(v).
Denote the concatenation of A(0), ..., A(N - 1) by A. We construct T*
recursively (see Fig. 3). We start at edge {a, b} = {0, N - 1} and consider triangle
Δ_1 = (a, b, x).

We make the vertex v(Δ_1) corresponding to Δ_1 the root of T*. Let Δ_2 =
(a, x, y) and Δ_3 = (x, b, z) be the two triangles adjacent to Δ_1. Then the
corresponding vertices v(Δ_2) and v(Δ_3) are the children of v(Δ_1). We recursively
construct the subtrees of T* rooted at v(Δ_2) and v(Δ_3), calling the tree
construction procedure with parameters {a_1, b_1} = {a, x} and {a_2, b_2} = {x, b},
respectively. Using this strategy we basically perform a depth-first traversal of
T*. The corresponding Euler tour (as represented by the dashed line in Fig. 3)
crosses the edges of G_Δ in their order of appearance in A. Thus, T* can be
constructed in a single scan over A and using O(N) stack operations.






Finding the separator: At every recursive call we assign weights w(v(Δ_1)) = w(x)
and w_p(v(Δ_1)) = w(a) + w(b) to the newly created vertex v(Δ_1). Let T*(v) be
the subtree of T* rooted at a vertex v. During the construction of T* we can
compute for every vertex v, the weight w(T*(v)) = Σ_{u ∈ T*(v)} w(u) of the subtree
T*(v). The removal of the edge connecting v(Δ_1) to its parent in T* corresponds
to removing vertices a and b from G_Δ. This partitions G_Δ into two subgraphs
of weights w(T*(v(Δ_1))) and w(G_Δ) - w(T*(v(Δ_1))) - w_p(v(Δ_1)). Thus, once
the weights w(T*(v)) and w_p(v) have been computed for every vertex v of T*,
it takes a single scan over the vertex list of T* to compute a size-2 2/3-separator
of G_Δ and thus of G. □
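
The weight test applied to each diagonal can be illustrated with a small internal-memory sketch. Instead of the dual tree T*, it uses prefix sums over the boundary order to obtain the weights of the two sides of a diagonal; this substitution, the function name and the input format are assumptions made for the illustration only.

```python
from fractions import Fraction
from itertools import accumulate

def find_size2_separator(weights, diagonals):
    """Return a diagonal (a, b) whose endpoints form a 2/3-separator, or None.

    weights[i] is the weight of vertex i, with the vertices numbered
    0..N-1 around the outer boundary; the total weight is at most 1.
    diagonals lists the diagonals of the triangulation as vertex pairs.
    """
    n = len(weights)
    prefix = [0] + list(accumulate(weights))   # prefix[i] = w[0] + ... + w[i-1]
    total = prefix[n]
    limit = Fraction(2, 3)
    for a, b in diagonals:
        a, b = min(a, b), max(a, b)
        inside = prefix[b] - prefix[a + 1]      # weight of vertices a+1 .. b-1
        outside = total - inside - weights[a] - weights[b]
        if inside <= limit and outside <= limit:
            return (a, b)
    return None

if __name__ == "__main__":
    w = [Fraction(1, 12)] * 12
    fan = [(0, k) for k in range(2, 11)]        # fan triangulation of a 12-gon
    print(find_size2_separator(w, fan))
```

Exact rational weights are used so that the comparison with 2/3 is not affected by floating-point rounding.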

5 Lower Bounds 

In this section, we prove matching lower bounds for all results in this paper,
except for computing separators. We show these lower bounds by reducing
list-ranking, which has an Ω(perm(N)) lower bound [T], to DFS, BFS, and
embedding of biconnected outerplanar graphs. The list-ranking problem is defined as
follows: Given a singly linked list L and a pointer to the head of L, compute for
every node of L its distance to the tail.

Note that list-ranking reduces trivially to DFS and BFS in general outerplanar 
graphs, as we can consider the list itself as an outerplanar graph and choose the 
tail of the list as the source of the search. 

Lemma 5. List-ranking can be reduced to computing a combinatorial embedding
of a biconnected outerplanar graph in O(scan(N)) I/Os.

Proof sketch. Given the list L, we can compute the tail t of L (t is the node with
no successor) and node t' with succ(t') = t in two scans over L. We consider
L as a graph with an edge between every vertex and its successor. In another
scan, we add edges {v, t} to L, for v ∉ {t, t'}. This gives us a graph G_1 as in
Fig. 4(a). It can be shown that the outerplanar embedding of G_1 is unique except
for flipping the whole graph. Thus, the rank of v in L is the position of edge
{v, t} in clockwise or counterclockwise order around t. □



Lemma 6. List-ranking can be reduced to breadth-first search and depth-first
search, respectively, in biconnected outerplanar graphs in O(scan(N)) I/Os.

Proof sketch. We only show the reduction for BFS. The reduction to DFS is
even simpler. As in the proof of the previous lemma, we identify the tail t of
L in a single scan. Then we add an edge {h, t} to L. This produces a cycle
G_2. We perform two breadth-first searches (see Fig. 4(b)), one with source t
(labels outside) and one with source h (labels inside). It is easy to see that
the distance d(v) of every node to the tail of the list can now be computed as
d(v) = N - 1 - d(h, v) if d(h, v) < d(t, v), and d(v) = d(t, v) otherwise. □
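
The reduction in the proof of Lemma 6 can be tried out on a small list. The sketch below is an internal-memory illustration only: it represents the list as a successor dictionary, closes it into a cycle with the edge between head and tail, runs two ordinary breadth-first searches, and combines the two distance labellings with the formula above. All function names are hypothetical.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Plain breadth-first search returning hop distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for x in adjacency[u]:
            if x not in dist:
                dist[x] = dist[u] + 1
                queue.append(x)
    return dist

def list_ranks_via_bfs(successor, head):
    """Rank (distance to the tail) of every node of a singly linked list.

    successor maps each node to its successor, or to None for the tail.
    """
    n = len(successor)
    tail = next(v for v, s in successor.items() if s is None)
    adjacency = {v: [] for v in successor}
    for v, s in successor.items():
        if s is not None:
            adjacency[v].append(s)
            adjacency[s].append(v)
    # Close the list into a cycle by adding the edge {head, tail}.
    adjacency[head].append(tail)
    adjacency[tail].append(head)
    from_head = bfs_distances(adjacency, head)
    from_tail = bfs_distances(adjacency, tail)
    return {
        v: n - 1 - from_head[v] if from_head[v] < from_tail[v] else from_tail[v]
        for v in successor
    }

if __name__ == "__main__":
    succ = {3: 1, 1: 4, 4: 0, 0: 5, 5: 2, 2: None}
    print(list_ranks_via_bfs(succ, 3))
```

Nodes closer to the head than to the tail on the cycle get their rank from the search rooted at the head; all other nodes read it off directly from the search rooted at the tail.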
Fig. 4



Proof sketch (Theorem 1). The theorem follows from the lemmas in this paper,
if we can reduce the upper bounds from O(sort(N)) to O(perm(N)) I/Os.
Let P be the problem at hand, A be an O(N) time internal memory algorithm
that solves P, and A' be an O(sort(N)) external memory algorithm that
solves P. (For the problems studied in this paper, linear time solutions in
internal memory are known, and we have provided the O(sort(N)) algorithms.)
We run algorithms A and A' in parallel, switching from one algorithm to the
other at every I/O operation. The computation stops as soon as one of the
algorithms terminates. At this point algorithms A and A' have performed at most
min{O(N), O(sort(N))} = O(perm(N)) I/Os. □
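
The interleaving argument can be mimicked with Python generators. In the sketch below each algorithm is modelled, as an assumption made for the illustration, by a generator that yields once per I/O operation and returns its result when done; `run_interleaved` and `fake_algorithm` are hypothetical names.

```python
def run_interleaved(algorithm_a, algorithm_b):
    """Alternate between two generators, one simulated I/O at a time.

    Returns (winner, result, total_ios), where winner is "A" or "A'" and
    total_ios counts the I/Os performed by both algorithms together.
    """
    total_ios = 0
    while True:
        for name, algorithm in (("A", algorithm_a), ("A'", algorithm_b)):
            try:
                next(algorithm)       # let this algorithm perform one I/O
                total_ios += 1
            except StopIteration as finished:
                return name, finished.value, total_ios

def fake_algorithm(num_ios, result):
    """Stand-in for an algorithm that needs num_ios I/Os to produce result."""
    for _ in range(num_ios):
        yield
    return result

if __name__ == "__main__":
    print(run_interleaved(fake_algorithm(5, "from A"), fake_algorithm(3, "from A'")))
```

Whichever algorithm finishes first after k I/Os, the other one has performed at most k + 1, so the total is within a constant factor of the smaller of the two costs.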
Abstract. This paper presents a new approximation algorithm for a
vehicle routing problem on a tree-shaped network with a single depot.
Customers are located on vertices of the tree. Demands of customers are
served by a fleet of identical vehicles with limited capacity. It is assumed
that the demand of a customer is splittable, i.e., it can be served by
more than one vehicle. The problem we are concerned with in this paper
asks to find a set of tours of the vehicles with minimum total length.
Each tour begins at the depot, visits a subset of the customers and
returns to the depot. We propose a 1.35078-approximation algorithm for
the problem (exactly, (√41 - 1)/4), which is an improvement over the
existing 1.5-approximation.



1 Introduction 

In this paper we consider a capacitated vehicle routing problem on a tree-shaped
network with a single depot. Let T = (V, E) be a tree, where V is a set of n
vertices and E is a set of edges, and let r ∈ V be a designated vertex called the depot.
A nonnegative weight w(e) is associated with each edge e ∈ E, which represents the
length of e. Customers are located at vertices of the tree, and a customer at v ∈ V
has a positive demand D(v). Thus, when there is no customer at v, D(v) = 0
is assumed. Demands of customers are served by a set of identical vehicles with
limited capacity. We assume throughout this paper that the capacity of every
vehicle is equal to one, and that the demand of a customer is splittable, i.e.,
it can be served by more than one vehicle. Each vehicle starts at the depot,
visits a subset of customers to (partially) serve their demands and returns to
the depot without violating the capacity constraint. The problem we deal with
in this paper asks to find a set of tours of vehicles with minimum total length
to satisfy all the demands of customers. We call this problem TREE-CVRP.

* Research of this paper is partly supported by the Grant-in-Aid for Scientific Research
on Priority Areas (B) by the Ministry of Education, Science, Sports and Culture of
Japan.
Vehicle routing problems have long been studied by many researchers (see
[3, 5] for a survey), and are found in various applications such as scheduling of
truck routes to deliver goods from a warehouse to retailers, material handling
systems and computer communication networks. Recently, AGVs (automated
guided vehicles) and material handling robots are often used not only in manufacturing
systems, but also in offices and hospitals, in order to reduce the material handling
efforts. The tree-shaped network can be typically found in buildings with simple
structures of corridors and in simple production lines of factories.

Vehicle scheduling problems on tree-shaped networks have recently been
studied by several authors mmm . Most of them dealt with single-vehicle
scheduling that seeks to find an optimal tour under certain constraints.

However, TREE-CVRP has not been studied in the literature until very re- 
cently. Last year Hamaguchi and Katoh proved its NP-hardness and proposed 
a 1.5-approximation algorithm m considered the variant of TREE-CVRP where 
demand of each customer is not splittable and gave 2-approximation algorithm.) 

In this paper, we present an improved 1.35078-approximation algorithm
for TREE-CVRP that exploits the tree structure of the network. This improves
on the existing 1.5-approximation algorithm by Hamaguchi and Katoh. The
basic idea behind the improvement is the use of reforming operations that
preserve the lower bound on the cost, which simplifies the analysis.

2 Preliminaries 

For vertices $u, v \in V$, let $path(u, v)$ be the unique path between $u$ and $v$. The
length of $path(u, v)$ is denoted by $w(path(u, v))$. We often view $T$ as a directed
tree rooted at $r$. For a vertex $v \in V - \{r\}$, let $parent(v)$ denote the parent of $v$.
We assume throughout this paper that when we write an edge $e = (u, v)$, $u$ is
the parent of $v$. For any $v \in V$, let $T_v$ denote the subtree rooted at $v$, and let
$w(T_v)$ and $D(T_v)$ denote the sum of the weights of the edges in $T_v$ and the sum of
the demands of the customers in $T_v$, respectively. Since customers are located on
vertices, customers are often identified with vertices.

For an edge $e = (u, v)$, let

\[ LB(e) = 2\, w(e)\, \lceil D(T_v) \rceil. \tag{1} \]

$LB(e)$ represents a lower bound of the cost required for traversing edge $e$ in an
optimal solution because, due to the unit capacity of a vehicle, the number of
vehicles required by any solution to serve the demands in $T_v$ is at least
$\lceil D(T_v) \rceil$, and each such vehicle passes $e$ at least twice (once in the forward
direction and once in the backward direction). Thus, we have the following lemma.

Lemma 1. $\sum_{e \in E} LB(e)$ is a lower bound of the optimal cost of TREE-CVRP.
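The lower bound of Lemma 1 can be evaluated in one bottom-up pass over the tree. The following is a minimal sketch under stated assumptions: the array-based tree encoding (`parent`, `weight`, `demand`), the function name `tree_lower_bound`, and the small instance are illustrative choices, not part of the paper. Exact rationals are used so that the ceiling is not disturbed by rounding.

```python
import math
from fractions import Fraction

def tree_lower_bound(parent, weight, demand):
    """Sum over all edges e = (parent[v], v) of 2 * w(e) * ceil(D(T_v)).

    parent[v] is the parent of vertex v (None for the root),
    weight[v] is the weight of the edge (parent[v], v) (ignored for the root),
    demand[v] is the demand located at vertex v.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = None
    for v, p in enumerate(parent):
        if p is None:
            root = v
        else:
            children[p].append(v)

    # Pre-order traversal; its reverse visits each vertex after all its descendants.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    subtree_demand = list(demand)
    total = 0
    for v in reversed(order):
        if parent[v] is not None:
            total += 2 * weight[v] * math.ceil(subtree_demand[v])
            subtree_demand[parent[v]] += subtree_demand[v]
    return total

# Hypothetical instance: edges 0-1 (weight 3), 1-2 (weight 1), 1-3 (weight 2).
parent = [None, 0, 1, 1]
weight = [0, 3, 1, 2]
demand = [Fraction(0), Fraction(0), Fraction(3, 4), Fraction(1, 2)]
bound = tree_lower_bound(parent, weight, demand)
```

In this instance the subtree below vertex 1 carries demand 5/4, so the edge of weight 3 contributes 2 * 3 * 2, and each leaf edge contributes twice its weight.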

3 Reforming Operations 

Our approximation algorithm repeats the following two steps until all the
demands are served. The first step is a reforming step in which we reshape a given
tree using seven different operations, all of which are "safe" in the sense that
they do not increase the lower bound on the cost given in Lemma 1. The second
step chooses an appropriate subtree and, among a few possible strategies that
depend on the case, applies the best one to serve the demands in the subtree.

The first reforming operation $R_1$ is applicable when some node has a demand
greater than or equal to 1. Suppose that a node $v$ has a demand $D(v) \ge 1$.
Then, we allocate $k = \lfloor D(v) \rfloor$ vehicles to $v$ to serve $k$ units of its demand. This
operation leaves a demand of less than one at $v$. Note that this operation is
clearly safe. Thus, we assume hereafter that each demand is less than one.

The second operation $R_2$ removes positive demand from each internal
node. If there is any internal node $v$ with positive demand, we create a new node
connected with $v$ by an edge of weight zero and move the demand of $v$ down to
the new node. It is easy to see that this operation is safe. Therefore, we can
assume that positive demand is placed only at leaves.

The third operation $R_3$ is applied to a pair of nodes $(u, v)$ such that a leaf $v$
is the unique child of $u$. If $D(u) + D(v) \le 1$, we contract the edge $(u, v)$, i.e., delete
$(u, v)$ and the node $v$, after replacing the demand $D(u)$ at $u$ by $D(u) + D(v)$
and then increasing the weight of the edge entering $u$ by $w(u, v)$. On the other hand,
if $D(u) + D(v) > 1$, then we send one vehicle to serve the full demand at $v$ and
a partial demand at $u$ so as to fill the capacity, and then reduce the demand
at $u$ accordingly. The edge to $v$ is removed together with $v$.

The fourth operation $R_4$ merges a subtree whose demand is less than or
equal to 1 into a single edge. Namely, for an internal node $v$ with $D(T_v) \le 1$,
$T_v$ is replaced by a single edge $(v, v')$ with edge weight equal to $w(T_v)$ and
$D(v') = D(T_v)$. Since $D(T_v) \le 1$ holds, this operation is also safe.

To define the more essential reforming operations of our approximation
algorithm, we need some more assumptions and definitions.

A node $v$ of a tree $T$ is called a p-node if

(i) $v$ is an internal node, and

(ii) all of the children of $v$ are leaves, and

(iii) the sum of the demands at those children is between 1 and 2.

A node $u$ is called a q-node if

(i) the sum of the demands in the subtree rooted at $u$, denoted by $D(T_u)$,
is at least 2, and

(ii) no child of $u$ has the property (i).

The fifth operation $R_5$ merges leaves that share a parent. For a node $u$, let
$\{v_1, v_2, \ldots, v_k\}$ be the set of its children that are leaves. By $w_i$ we denote the
weight of the edge $(u, v_i)$. We examine every pair of leaves. For each pair
$(v_i, v_j)$ we check whether the sum of their demands exceeds 1. If
$D(v_i) + D(v_j) \le 1$, then we merge them. Precisely, we remove the leaf $v_j$ together
with its associated edge $(u, v_j)$ after replacing the demand of $v_i$ with
$D(v_i) + D(v_j)$ and the weight $w_i$ with $w_i + w_j$. We repeat this process while there is
any mergeable pair of leaves. Figure 1 illustrates this merging process. This
operation is particularly useful for p- and q-nodes, since it reduces the number
of cases to be considered.
An important property of the tree obtained after applying $R_5$ to p-nodes
is that any p-node has at most three children (leaves): otherwise, since every
pair of remaining leaves has total demand exceeding 1, the sum of the demands
of those children would exceed 2, contradicting the definition of a p-node.
Thus, we can assume that any p-node has at most three children.
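The merging process described above can be sketched as follows. This is only an illustration under assumptions: leaves are represented as hypothetical `(demand, weight)` pairs attached to one common parent, and the function name `merge_leaves` is not from the paper.

```python
from fractions import Fraction

def merge_leaves(leaves):
    """Repeatedly merge any two sibling leaves whose total demand is at most 1.

    leaves is a list of (demand, edge_weight) pairs for the leaf children of
    one node; merging adds both the demands and the edge weights.
    """
    leaves = list(leaves)
    while True:
        # Look for any mergeable pair (i, j) with i < j.
        pair = next(
            ((i, j)
             for i in range(len(leaves))
             for j in range(i + 1, len(leaves))
             if leaves[i][0] + leaves[j][0] <= 1),
            None,
        )
        if pair is None:
            return leaves
        i, j = pair
        leaves[i] = (leaves[i][0] + leaves[j][0], leaves[i][1] + leaves[j][1])
        del leaves[j]

# Hypothetical leaves of one node, given as (demand, edge weight).
merged = merge_leaves([
    (Fraction(1, 4), 1),
    (Fraction(1, 4), 2),
    (Fraction(3, 4), 5),
    (Fraction(3, 5), 1),
])
```

After merging, every remaining pair has total demand above 1, which is the property used to bound the number of children of a p-node.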




Fig. 1. The merging operation (panels: before merge, after merge).



Now we have a limited number of situations for q-nodes, listed as follows:
Case (a): A q-node $u$ has more than one p-node among its descendants. In this
case, the children of the q-node are either p-nodes or leaves (see Figure 2(a)).

Case (b): A q-node $u$ has exactly one p-node $v$ among its descendants. In this case
we may have an arbitrary number of internal nodes on the unique path from $u$ to
$v$, each of which has only one edge connected to a leaf (see Figure 2(b)).

Case (c): A q-node $u$ has no p-node among its descendants. In this case, all of
the children of the q-node are leaves (see Figure 2(c)). This is just like a p-node
except that the sum of the demands is at least 2.




Fig. 2. The three cases for a q-node: (a) q-node has more than one p-node;
(b) q-node has one p-node; (c) q-node has no p-node.

There are only two types of p-nodes, since they have two or three children
(leaves). The sixth reforming operation $R_6$ removes p-nodes having only two
children. This is done by connecting those children directly to the parent of
the p-node. Formally, suppose a p-node $v$ is connected to its parent $u$ by an
edge of weight $a$ and to two children $v_1$ and $v_2$ by edges of weights $w_1$ and $w_2$,
respectively. Then, each $v_i$ ($i = 1, 2$) is connected directly to $u$ by an edge of
weight $w_i + a$. Note that this operation is also safe.

Yet another reforming operation, $R_7$, is required in Cases (a) and (b) above.
In these cases, for a p-node $v$, there may be some branches to leaves on the
way from $u$ to $v$. Then, those leaves are placed as children of the p-node with
edges of weights equal to those of the branches. In addition, the internal nodes on
the path from $u$ to $v$ are erased so that the path is replaced by a single edge with
weight equal to that of the path. This operation also preserves the lower
bound. Notice that when $u$ has children that are leaves directly connected to
$u$, such a leaf $x$ can be a candidate for this operation if $D(T_v) + D(x) \le 2$ holds.

Consequently, we can assume that if a q-node has any p-node as its child, then
the p-node must have three children.

4 Approximation Algorithm 

The approximation algorithm presented in this paper is based on
reformations that preserve the lower bound and on combining two or more different
strategies. The main difference from the previous approximation algorithm given
by Hamaguchi and Katoh is the definition of the minimal subtree to which
algorithmic strategies are applied. They introduced the notions of D-minimality
and D-feasibility. That is, a vertex $v \in V$ is called D-feasible if $D(T_v) \ge 1$ and
is called D-minimal if it is D-feasible but none of its children is. Their algorithm
first finds a D-minimal vertex $v$, and determines a routing of one or two vehicles
that (partially) serve the demands of vertices in $T_v$ by applying one of two
strategies depending on their merits.

Our algorithm pays attention to subtrees whose demands exceed 2 instead of
1. Usually this causes an explosion of the possible cases, but the point here is that
the reforming operations described above greatly simplify the possible cases.
This is the main contribution of this paper.

Now, let us describe our algorithm. It first applies the seven reforming operations
$R_1$ through $R_7$ to an input tree until none of these operations can be applied
any more. The algorithm consists of a number of rounds. In each round, it focuses
on a particular q-node $u$ and prepares a few strategies, each of which allocates
two, three, or four vehicles to (partially) serve the demands in the subtree $T_u$.
Among the strategies prepared, we choose the best one and apply it to $T_u$. In the
same manner as in jB], for each strategy, we compute the cost of tours required 
by the strategy. We also compute the decrease of the lower bound. Namely, let
$P$ denote the problem before applying the strategy. After applying the strategy,
the demands of nodes in the subtree $T_u$ are decreased, and we obtain another
problem instance $P'$ to which the algorithm will be further applied. The decrease
of the lower bound is defined as $LB(P) - LB(P')$. The best strategy is the one giving
the smallest ratio of the cost of tours to the decrease of the lower bound. As we
shall show later, the smallest ratio is always at most 1.35078.

When no q-node remains, the algorithm enters its final round, which is called
the base case. As in the other cases, it will be shown that the ratio is at most
1.35078 in this case as well.

Theorem 1. The approximation ratio of our algorithm for TREE-CVRP is 1.35078.

Proof. The proof technique is similar to the one by Hamaguchi and Katoh.
In fact, the theorem can be proved by induction on the number of rounds.

Assuming that the theorem holds for problem instances that require at most
$k$ rounds, we consider a problem instance $P$ of TREE-CVRP for which our
algorithm requires $k + 1$ rounds. In the first round, we find a q-node and apply an
appropriate strategy based on the ratios defined above. Let $P'$ be the problem
instance obtained from $P$ by decreasing the demands served in this round. Let
$LB(P')$ be the lower bound for the problem $P'$ and $LB_1$ be the decrease of the
lower bound in this round. Let $cost(P)$, $cost_1$ and $cost(P')$ denote the total cost
required for the original problem $P$ by our algorithm, the cost required by the
first round, and the cost required by our algorithm for the remaining problem $P'$,
respectively (i.e., $cost(P) = cost_1 + cost(P')$). Then, we have

\[ \frac{cost(P)}{LB(P)} = \frac{cost_1 + cost(P')}{LB_1 + LB(P')}. \tag{2} \]

Since $cost(P')/LB(P') \le 1.35078$ holds by the induction hypothesis, and the
right-hand side of (2) is at most $\max\{cost_1/LB_1,\ cost(P')/LB(P')\}$, it suffices
to prove $cost_1/LB_1 \le 1.35078$.

As we shall prove below (Lemmas 2 through 4), this inequality holds in every
case (the base case will be proved in Lemma 5). Thus, we have the theorem. □

Recall that after each round we obtain a new problem instance, and we
need to apply the reforming operations to it again until none of the seven
reforming operations can be applied any more. Suppose there is at least one
q-node. The strategies we prepare depend on the cases explained below.

Case 1: A q-node has more than one p-node as its children.

Case 2: A q-node has exactly one p-node as its child.

Case 3: A q-node has no p-node as its child.

In Case 1, we focus on two arbitrary p-nodes. The remaining p-nodes, if any, will
be considered in later rounds of the algorithm. In Case 2, let $u$ and $v$ denote
the q-node and the p-node, respectively. From the definition of a q-node, $u$ has at
least one child other than the p-node. We choose an arbitrary child $v'$ other
than the p-node. The algorithm then focuses on the subgraph consisting of the edges
$(u, v)$, $(u, v')$ and the subtree $T_v$. We notice here that $D(T_v) + D(v') > 2$ since
otherwise $v'$ can be shifted down to become a child of $v$ by reforming operation $R_7$.

In Case 3, we can assume that the q-node has at least three leaves, from the
definition of a q-node. We arrange those leaves in any order and find where
the sum of the demands exceeds 2. Recall that the merging operation $R_5$ is
already applied, and thus the sum of the demands of the first four leaves certainly
exceeds 2. We conclude that there are only two possibilities: either the sum
for the first three leaves exceeds 2 (Subcase 3A) or that for the first four does
(Subcase 3B).

Let us describe the algorithm for treating q-nodes depending on the above 
cases and then consider the base case. For each case, we shall explain how to 
schedule vehicles to serve demands of nodes that the algorithm focuses on, and 
prove that the ratio of the cost to the lower bound is at most 1.35078. In the 
proofs for all cases, we shall implicitly use the following simple facts: 

Fact 1: $0 < x \le y$ and $a \ge 0$ imply $\frac{y + a}{x + a} \le \frac{y}{x}$.

Fact 2: For p,g,r,s > 0, 0 < a < & and r > ^ ^ ^ 
Case 1: Let $u$ denote the q-node and let $x$ and $x'$ denote the two p-nodes the
algorithm focuses on. The weight of the unique path from the tree root $r$ to $u$
is denoted by $a$, and those from $u$ to $x$ and $x'$ by $b$ and $b'$, respectively. Let
$v_1, v_2$ and $v_3$ denote the three leaves of the subtree $T_x$. We denote by $D_1, D_2$ and
$D_3$ their demands, and by $w_1, w_2$ and $w_3$ the weights of the edges connecting the
leaves to their parent. Now, by assumption, $0 < D_i < 1$ for $i = 1, 2, 3$,
$1 < D_1 + D_2 < 2$, and $1.5 < D_1 + D_2 + D_3 < 2$ ($D_1 + D_2 + D_3 > 1.5$ follows since
otherwise the sum of some two demands among $D_1$, $D_2$ and $D_3$ is less than or equal
to 1, and these two demands are mergeable, a contradiction). Here, without loss
of generality, we assume $w_1 \ge w_2 \ge w_3$. Similarly, let $v'_1, v'_2$ and $v'_3$ denote the
three leaves of the subtree $T_{x'}$, and define the symbols $D'_1, D'_2, D'_3$ and
$w'_1, w'_2, w'_3$ in the same manner as for $T_x$. By assumption, $0 < D'_i < 1$ for
$i = 1, 2, 3$, $1 < D'_1 + D'_2 < 2$, and $1.5 < D'_1 + D'_2 + D'_3 < 2$ hold, and
$w'_1 \ge w'_2 \ge w'_3$ is assumed.

Here we prepare only one strategy, which allocates two vehicles to each of the
subtrees $T_x$ and $T_{x'}$ to serve the demands of each of these subtrees. For $T_x$, the
first vehicle serves the demand at $v_1$ and a partial demand at $v_3$ so that the
sum of the demands is equal to 1, and the second vehicle serves the demand at $v_2$
and the remaining demand at $v_3$. Similarly to $T_x$, we schedule two vehicles to
serve the demands of $T_{x'}$.

The cost required by these four vehicles is given by

\[ 8a + 4b + 4b' + 2w_1 + 2w_2 + 4w_3 + 2w'_1 + 2w'_2 + 4w'_3. \]

The decrease of the lower bound is at least

\[ 6a + 4b + 4b' + 2w_1 + 2w_2 + 2w_3 + 2w'_1 + 2w'_2 + 2w'_3. \]

The first term $6a$ comes from the fact that for every edge $e$ on the path from
the root $r$ to $u$, the decrease of the lower bound $LB(e)$ is at least $6w(e)$ and at
most $8w(e)$ from (1), because the decrease of $D(T_u)$ is between 3 and 4. Thus,
the ratio $r_1$ of the cost of the tours to the decrease of the lower bound satisfies

\[ r_1 \le \frac{8a + 4b + 4b' + 2w_1 + 2w_2 + 4w_3 + 2w'_1 + 2w'_2 + 4w'_3}
               {6a + 4b + 4b' + 2w_1 + 2w_2 + 2w_3 + 2w'_1 + 2w'_2 + 2w'_3}. \tag{3} \]

From $w_1 \ge w_2 \ge w_3$ and $w'_1 \ge w'_2 \ge w'_3$, we can show that $r_1$ is bounded by
4/3, using Facts 1 and 2:

\[ r_1 \le \frac{8a + 2w_1 + 2w_2 + 4w_3 + 2w'_1 + 2w'_2 + 4w'_3}
               {6a + 2w_1 + 2w_2 + 2w_3 + 2w'_1 + 2w'_2 + 2w'_3}
       \le \frac{8a + 8w_3 + 8w'_3}{6a + 6w_3 + 6w'_3} = \frac{4}{3}. \tag{4} \]



Lemma 2. In Case 1, the approximation ratio is at most 4/3. 
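As a numerical sanity check of the Case 1 bound (not a proof), the ratio of the tour cost to the decrease of the lower bound can be sampled for random weights. The function names and the sampling ranges below are assumptions made only for this sketch; the cost and lower-bound expressions are the ones stated for Case 1.

```python
import random

def case1_ratio(a, b, b2, w, w2):
    """Cost / lower-bound decrease for Case 1.

    a: path weight from root to the q-node; b, b2: weights from the q-node to
    the two p-nodes; w, w2: leaf edge weights (w1 >= w2 >= w3) of each p-node.
    """
    cost = (8 * a + 4 * b + 4 * b2
            + 2 * w[0] + 2 * w[1] + 4 * w[2]
            + 2 * w2[0] + 2 * w2[1] + 4 * w2[2])
    lb_decrease = 6 * a + 4 * b + 4 * b2 + 2 * sum(w) + 2 * sum(w2)
    return cost / lb_decrease

def max_sampled_ratio(trials=2000, seed=0):
    """Largest ratio seen over random instances with sorted leaf weights."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a, b, b2 = (rng.uniform(0, 10) for _ in range(3))
        w = sorted((rng.uniform(0.01, 10) for _ in range(3)), reverse=True)
        w2 = sorted((rng.uniform(0.01, 10) for _ in range(3)), reverse=True)
        worst = max(worst, case1_ratio(a, b, b2, w, w2))
    return worst
```

The bound 4/3 is attained when a = b = b' = 0 and all leaf weights are equal, which is why the sampled maximum stays below it.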



Case 2: Suppose that a q-node $u$ has only one p-node $v$, having three leaves
$v_1, v_2$ and $v_3$, together with a single leaf $v_4$. As before, we denote the demand
and edge weight associated with $v_i$ by $D_i$ and $w_i$, respectively, and
$w_1 \ge w_2 \ge w_3$ is assumed. We denote the path length from the root to $u$ by $a$,
and the weight of the edge between $u$ and $v$ by $b$. The algorithm prepares
different strategies depending on whether

\[ D_1 + D_2 + D_4 \le 2 \]

holds or not.



Subcase 2A: $D_1 + D_2 + D_4 \le 2$ holds. We prepare the following four strategies.

Strategy 1: This strategy allocates three vehicles. The first vehicle serves the
full demand at $v_1$ and a partial demand at $v_3$ up to the full capacity. The second one
serves the full demand at $v_2$ and the remaining demand at $v_3$ (which is less than
1). The third one serves the demand of $v_4$. Then, the ratio $r_1$ is given by

\[ r_1 = \frac{6a + 4b + 2W + 2w_3}{4a + 4b + 2W}, \tag{5} \]

where $W = w_1 + w_2 + w_3 + w_4$.

Strategy 2: This strategy is the same as Strategy 1 except that the roles of $v_3$
and $v_4$ are exchanged. The ratio $r_2$ is given by

\[ r_2 = \frac{6a + 4b + 2W + 2w_4}{4a + 4b + 2W}. \tag{6} \]

Strategy 3: This strategy allocates two vehicles, (1) to serve the full demand at
$v_1$ and a partial demand at $v_3$ to fill the capacity, and (2) to serve the full demand
at $v_2$, the remaining demand at $v_3$, and moreover a partial demand at $v_4$. This
is possible since $D_1 + D_2 + D_3 < 2$. The remaining demand at $v_4$ is left for the
next round. Then, the ratio $r_3$ is given by

\[ r_3 = \frac{4a + 4b + 2W + 2w_3}{4a + 4b + 2W - 2w_4}. \tag{7} \]

Strategy 4: This strategy allocates two vehicles, (1) to serve the full demand at
$v_1$ and a partial demand at $v_2$ to fill the capacity, and (2) to serve the full demand
at $v_4$, the remaining demand at $v_2$, and moreover a partial demand at $v_3$. This
is possible since $D_1 + D_2 + D_4 \le 2$. The remaining demand at $v_3$ is left for the
next round. Then, the ratio $r_4$ is given by

\[ r_4 = \frac{4a + 4b + 2W + 2w_2}{4a + 2b + 2W - 2w_3}. \tag{8} \]
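The four ratios of Subcase 2A are simple closed-form expressions, so picking the best strategy for a concrete instance is a direct evaluation. The sketch below assumes hypothetical names (`subcase_2a_ratios`, the sample weights) and simply transcribes the four expressions for r1 through r4 given for Strategies 1 to 4.

```python
def subcase_2a_ratios(a, b, w1, w2, w3, w4):
    """Return {strategy number: cost / lower-bound decrease} for Subcase 2A."""
    W = w1 + w2 + w3 + w4
    return {
        1: (6 * a + 4 * b + 2 * W + 2 * w3) / (4 * a + 4 * b + 2 * W),
        2: (6 * a + 4 * b + 2 * W + 2 * w4) / (4 * a + 4 * b + 2 * W),
        3: (4 * a + 4 * b + 2 * W + 2 * w3) / (4 * a + 4 * b + 2 * W - 2 * w4),
        4: (4 * a + 4 * b + 2 * W + 2 * w2) / (4 * a + 2 * b + 2 * W - 2 * w3),
    }

# Hypothetical instance: the q-node is the root (a = 0), b = 0, unit leaf weights.
ratios = subcase_2a_ratios(0, 0, 1, 1, 1, 1)
best = min(ratios, key=ratios.get)  # strategy with the smallest ratio
```

With a = 0 the three-vehicle strategies are cheap relative to the lower bound; as a grows, the two-vehicle strategies become preferable because they save one traversal of the root path.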

The smallest value among the above four values is evaluated by the following 
case analysis. 



(i) $w_3 \ge w_4$. We choose the better one between Strategies 2 and 3. From
$w_3 \ge w_4$, we have

\[ r_2 \le \frac{6a + 4b + 10w_4}{4a + 4b + 8w_4} \le \frac{6a + 10w_4}{4a + 8w_4},
   \quad \text{and} \quad
   r_3 \le \frac{4a + 10w_4}{4a + 6w_4}. \tag{9} \]

By letting $x = a/w_4$, we compute

\[ \max_{x} \min\left\{ \frac{6x + 10}{4x + 8},\ \frac{4x + 10}{4x + 6} \right\}. \tag{10} \]

The maximum is attained when $\frac{6x + 10}{4x + 8} = \frac{4x + 10}{4x + 6}$ holds, i.e.,
$x = \frac{\sqrt{41} - 1}{4} \approx 1.35078$. The maximum value of (10) is also
$\approx 1.35078$.
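The max-min value in (10) can be checked numerically. The first expression is increasing in x and the second is decreasing, so the maximum of their minimum sits at the crossing point, which solves 2x^2 + x - 5 = 0. The function name `g` and the grid are assumptions of this sketch.

```python
import math

def g(x):
    """min of the two upper bounds on r2 and r3 as functions of x = a / w4."""
    return min((6 * x + 10) / (4 * x + 8), (4 * x + 10) / (4 * x + 6))

# Positive root of 2x^2 + x - 5 = 0, where the two expressions coincide.
x_star = (math.sqrt(41) - 1) / 4

# Coarse grid search over x in [0, 20] as an independent check.
grid_max = max(g(i / 100) for i in range(2001))
```

Here the crossing point and the value attained there coincide: both equal (sqrt(41) - 1)/4, which is about 1.35078.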

(ii) $w_2 \ge w_4 \ge w_3$. We choose the better one between Strategies 1 and 3. The
analysis is done in the same way as in case (i).



(iii) $w_4 \ge w_2$. We choose the better one between Strategies 1 and 4. We have

\[ r_1 = \frac{6a + 4b + 2W + 2w_3}{4a + 4b + 2W}
       \le \frac{6a + 4b + 2w_1 + 6w_2 + 2w_4}{4a + 4b + 2w_1 + 4w_2 + 2w_4}
       \le \frac{6a + 4b + 10w_4}{4a + 4b + 8w_4}, \]

\[ r_4 = \frac{4a + 4b + 2W + 2w_2}{4a + 2b + 2W - 2w_3}
       \le \frac{4a + 4b + 2w_1 + 6w_2 + 2w_4}{4a + 2b + 2w_1 + 2w_2 + 2w_4}
       \le \frac{4a + 4b + 10w_4}{4a + 2b + 6w_4}. \]

By letting $x = a/w_4$ and $y = b/w_4$, we have

\[ \min\{r_1, r_4\} \le \min\left\{ \frac{3x + 2y + 5}{2x + 2y + 4},\
                                   \frac{2x + 2y + 5}{2x + y + 3} \right\}. \tag{11} \]

Letting $f(x, y)$ be the term on the right-hand side of (11), we compute
$\max_{x, y \ge 0} f(x, y)$ using a known result on generalized fractional
programming. For this, we consider the following parametric problem:

\[ z(\lambda) = \max_{x, y \ge 0} \min\{ 3x + 2y + 5 - \lambda(2x + 2y + 4),\
                                          2x + 2y + 5 - \lambda(2x + y + 3) \}. \tag{12} \]

Letting

\[ f_1(x, y) = 3x + 2y + 5 - \lambda(2x + 2y + 4), \qquad
   f_2(x, y) = 2x + 2y + 5 - \lambda(2x + y + 3), \]

the maximum in (12) is attained when $f_1(x, y) = f_2(x, y)$, i.e., $x = \lambda y + \lambda$.
Then, substituting $x = \lambda(y + 1)$ into $f_1(x, y)$, we have

\[ f_1(x, y) = (-2\lambda^2 + \lambda + 2)\, y + 5 - \lambda - 2\lambda^2. \tag{13} \]

When $-2\lambda^2 + \lambda + 2 \ge 0$, i.e., $\lambda \le (1 + \sqrt{17})/4 \approx 1.28$, we have
$5 - \lambda - 2\lambda^2 > 0$, so $f_1(x, y) > 0$ holds for any $y \ge 0$. Thus, $f_1(x, y) = 0$
does not occur when $-2\lambda^2 + \lambda + 2 \ge 0$. When $-2\lambda^2 + \lambda + 2 < 0$, the maximum
of $f_1$ is attained at $y = 0$, and the value of $f_1(x, y)$ at $y = 0$ is
$5 - \lambda - 2\lambda^2$. Thus, when $5 - \lambda - 2\lambda^2 = 0$, i.e.,
$\lambda = (\sqrt{41} - 1)/4 \approx 1.35078$,
$z(\lambda) = \max_{x, y} \min\{f_1(x, y), f_2(x, y)\} = 0$ holds. Therefore, the approximation
ratio is 1.35078 in this case. Summarizing the analysis made in (i), (ii) and (iii),
we have the following lemma.

Lemma 3. In Subcase 2A, the approximation ratio is at most 1.35078. 
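The bound obtained in case (iii) can be cross-checked by a brute-force grid search over x = a/w4 and y = b/w4. This is only a numerical sanity check under assumed names (`f`, `lam`) and an assumed grid; it does not replace the parametric argument.

```python
import math

def f(x, y):
    """Right-hand side of (11): the smaller of the two bounds on r1 and r4."""
    return min((3 * x + 2 * y + 5) / (2 * x + 2 * y + 4),
               (2 * x + 2 * y + 5) / (2 * x + y + 3))

# Positive root of 5 - lam - 2 * lam**2 = 0.
lam = (math.sqrt(41) - 1) / 4

# Grid over x, y in [0, 30] with step 0.1.
grid_max = max(f(i / 10, j / 10) for i in range(301) for j in range(301))

# The bound is tight at y = 0 and x = lam.
value_at_optimum = f(lam, 0.0)
```

The grid maximum stays below `lam`, and the value at (lam, 0) matches it, consistent with the maximum being attained at y = 0.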

The proof for Subcase 2B ($D_1 + D_2 + D_4 > 2$) proceeds similarly.

Lemma 4. In Subcase 2B, the approximation ratio is at most 1.35078. 



Subcase 3A: A q-node $u$ has three leaves. Let $v_1, v_2$ and $v_3$ be those three
leaves. We denote by $D_1, D_2$ and $D_3$ their demands, and by $w_1, w_2$ and $w_3$ the
edge weights. The weight of the unique path from the root $r$ to $u$ is denoted by
$a$. This case can be treated as a special case of Subcase 2B in which the leaf $v_3$
plays the role of $v_4$, $b = 0$, $D_3 = 0$, and $w_3 = 0$, while $v_i$, $D_i$ and $w_i$ are
unchanged for $i = 1, 2$.

Subcase 3B: A q-node $u$ has four leaves. This case can be viewed as a special
case of Case 2 in which the length of the path from $u$ to $v$ is equal to 0. Thus,
it is treated in Case 2.

Finally, we turn our attention to the base case where there is no q-node
in the input tree, that is, the total sum of the demands is less than 2. Applying
the reforming operations to the tree results in a simple tree of one of three forms,
depending on the number of leaves: (a) the root $r$ has only one leaf, (b) $r$ has
two leaves, and (c) $r$ has one child $u$ which has three leaves.

In case (a), the remaining demand is less than one, and thus is served
optimally by a single vehicle. In case (b), we allocate two vehicles, one for each leaf.
It is easy to see that this strategy is also optimal. Now let us consider case (c),
in which a p-node $u$ has three leaves $v_1, v_2$ and $v_3$. The symbols $D_i$, $w_i$ and $a$ are
used as before. We also assume that $w_1 \ge w_2 \ge w_3$. We allocate two vehicles:
one to serve the demand $D_1$ and a fraction of $D_3$ so as to fill the full capacity
of the vehicle, and another to serve $D_2$ and the remaining part of $D_3$. Then,
the ratio is given by

\[ \frac{4a + 2w_1 + 2w_2 + 4w_3}{4a + 2w_1 + 2w_2 + 2w_3}
   \le \frac{4a + 8w_3}{4a + 6w_3} \le \frac{4}{3}. \tag{14} \]



Lemma 5. In the base case, the approximation ratio is at most 1.35078. 



5 Conclusions 

We have presented a new approximation algorithm for finding tours to
serve demands located at nodes of a tree-shaped network. Our new algorithm
establishes the approximation ratio 1.35078 (exactly, $(\sqrt{41} - 1)/4$). This ratio
seems to be almost the best possible, since there is an instance of TREE-CVRP for
which the cost of an optimal solution is asymptotically 4/3 times larger than
the lower bound of the cost. To obtain a better ratio we have to improve the lower
bound, which is left for future research.
Abstract. Cellular networks are generally modeled as node-weighted
graphs, where the nodes represent cells and the edges represent the
possibility of radio interference. An algorithm for the channel assignment
problem must assign as many channels as the weight indicates to every
node, such that any two channels assigned to the same node satisfy the
co-site constraint, and any two channels assigned to adjacent nodes
satisfy the inter-site constraint.

We describe several approximation algorithms for channel assignment
with arbitrary co-site and inter-site constraints and reuse distance 2 for
odd cycles and so-called hexagon graphs that are often used to model
cellular networks. The algorithms given for odd cycles are optimal for
some values of constraints, and have performance ratio at most
$1 + 1/(n - 1)$ for all other cases, where $n$ is the length of the cycle. Our main
result is an algorithm of performance ratio at most $4/3 + \epsilon$ for hexagon graphs
with arbitrary co-site and inter-site constraints.



1 Introduction 

The demand for wireless telephony and other services has been growing drama- 
tically over the last decade. As a result of this, radio spectrum resources are 
scarce, and their efficient use becomes of critical importance. The cellular con- 
cept was proposed as an early solution to the problem of spectrum congestion. 
By dividing the service area into small coverage areas called cells served by low 
power transmitters, it became possible to reuse the same frequencies in different 
cells, provided they are far enough apart. With growing demand, it is neces- 
sary to perform this reuse as efficiently as possible, while ensuring that radio 
interference is at acceptable levels. 

Cellular networks are generally modeled as node-weighted graphs, where the
nodes represent the cells, and the edges represent the possibility of radio
frequency interference. The weight on a node represents the number of calls
originating in the cell represented by the node. The base station in a cell must
assign frequency channels to each call originating in the cell. However, this
assignment of channels must satisfy certain interference constraints. In particular,
channels that are too close together may interfere with each other when they are
assigned to calls that originate in the same or adjacent cells. These interference
constraints can be represented by a set of integers $c_0 \ge c_1 \ge c_2 \ge \ldots$, where $c_i$
is the minimum separation required between two channels assigned to calls in
cells that are distance $i$ apart in the network. The parameter $c_0$, which is the
minimum gap between two channels assigned to the same cell, is called the
co-site constraint, and the other constraints are called inter-site constraints. The
minimum graph distance between nodes that can be assigned the same channel
is called the reuse distance. In order to optimize the use of the spectrum, the
objective of the channel assignment algorithm is to minimize the span of the
assignment, that is, the difference between the largest numbered channel used
and the smallest channel used.

When $c_0 = c_1 = 1$, and $c_i = 0$ for all $i > 1$, the problem reduces to the
multicolouring problem, which has been widely studied. The problem is NP-hard
for many classes of graphs, including the so-called hexagon graphs, which have
traditionally been used to represent cellular networks [MR97]. Hexagon graphs
are subgraphs of the triangular lattice (see Figure 1). They are particularly
relevant to channel assignment, since they represent a regular cellular layout where
the cells are hexagonal, and interference only occurs between neighboring cells.
Optimal solutions for this restricted channel assignment problem are possible for
some classes of graphs including complete graphs, bipartite graphs, odd cycles,
and outerplanar graphs [NS97]. When the chromatic number of the underlying
graph is $k$, an approximation algorithm with a performance ratio of $k/2$ has been
shown [JK95]. For hexagon graphs, approximation algorithms with performance
ratio of 4/3 are known [NS97, MR97]. For the case $c_0 = c_1 = c_2 = 1$, and $c_i = 0$
for all $i > 2$, an approximation algorithm with performance ratio 7/3 is given in

F5M1 . 

In this paper, we study the case of reuse distance 2, but arbitrary co-site and
inter-site constraints. More precisely, we study the case $c_0 \ge c_1$ and $c_i = 0$ for
all $i > 1$. Thus channels assigned to the same cell must differ by at least $c_0$ and
those assigned to adjacent cells must differ by at least $c_1$. For the case $c_1 = 1$,
Schabanel et al. [SUZ97] give a 4/3 approximation algorithm for hexagon graphs.
For arbitrary $c_0$ and $c_1$, a tight bound for cliques was given by Gamst [Gam86],
and an optimal algorithm for bipartite graphs was given by Gerke [Ger99].

We give the first known algorithms with provable performance guarantees for 
channel assignment with arbitrary constraints in odd cycles and hexagon graphs. 
The performance of our algorithms is evaluated using known lower bounds based 
on the maximum weight on a node, the total weight and the maximal number 
of nodes that can receive the same channel, or the weights on a clique and their 
distribution. We first show six simple algorithms for bipartite graphs, odd cycles, 
and 3-colourable graphs. Using these as building blocks, we derive an optimal 
algorithm for odd cycles when cg > where n is the length of the cycle. For 

the case where cg < we give near-optimal algorithms with performance 

ratio at most 1 J- . 
For hexagon graphs, we give approximation algorithms with performance
ratio at most 4/3, when $c_1 \le c_0 \le 2c_1$, and $9c_1/4 \le c_0 \le 3c_1$. For the
intermediate case $2c_1 < c_0 < 9c_1/4$, the performance ratio of the algorithm is less than
4/3 + 1/100. There is a straightforward optimal algorithm for hexagon graphs
when $c_0 \ge 3c_1$. Thus for arbitrary co-site and inter-site constraints, our
algorithms nearly match the performance of the best known algorithm for the case
when co-site and inter-site constraints are both exactly equal to 1.




Fig. 1. A hexagon graph and a 3-colouring of the nodes of the graph. The hexagonal 
area around each node represents the calling area it serves. 



The rest of the paper is organized as follows. We define the problem
formally in Section 2. We give simple (but not necessarily optimal) algorithms for
channel assignment in bipartite graphs, odd cycles, and hexagon graphs in
Section 3. Near-optimal algorithms for odd cycles and approximation algorithms for
hexagon graphs are then given in Section 4.



2 Preliminaries 

For the basic definitions of graph theory we refer to [BM76]. A stable set in a
graph is a set of nodes of which no pair is adjacent. A clique in a graph is a set
of nodes of which every pair is adjacent.

A constrained graph $G = (V, E, c_0, c_1)$ is a graph $G = (V, E)$ and positive
integer parameters $c_0$ and $c_1$ representing the reuse differences prescribed
between pairs of channels assigned to the same node and adjacent nodes,
respectively. A constrained, weighted graph is a pair $(G, w)$ where $G$ is a
constrained graph and $w$ is a positive integral weight vector indexed by the
nodes of $G$. The component of $w$ corresponding to node $u$ is denoted by $w(u)$
and called the weight of node $u$. The weight of node $u$ represents the number of
calls to be serviced at node $u$. We use $w_{max}$ to denote $\max\{w(v) \mid v \in V\}$ and
$w_{min}$ to denote the corresponding minimum weight of any node in the graph.
A channel assignment for a constrained, weighted graph $(G, w)$ where
$G = (V, E, c_0, c_1)$ is an assignment $f$ of sets of non-negative integers (which will
represent the channels) to the nodes of $G$ which satisfies the conditions:

\[ |f(u)| = w(u) \quad (u \in V), \]

\[ i \in f(u) \text{ and } j \in f(v) \ \Rightarrow\ |i - j| \ge c_1 \quad ((u, v) \in E,\ u \ne v), \]

\[ i, j \in f(u) \text{ and } i \ne j \ \Rightarrow\ |i - j| \ge c_0 \quad (u \in V). \]

The span $S(f)$ of a channel assignment $f$ of a constrained weighted graph
is the difference between the lowest and the highest channel assigned by $f$,
in other words, $S(f) = \max f(V) - \min f(V)$, where $f(V) = \bigcup_{u \in V} f(u)$. The
span $S(G, w)$ of a constrained, weighted graph $G$ and a positive integer
vector $w$ indexed by the nodes of $G$ is the minimum span of any channel
assignment for $(G, w)$. We use $\chi(G, w)$ to denote the minimal number of channels
needed for an assignment of the weighted, unconstrained graph $G$. Note that
$\chi((V, E, c_0, c_1), w) = S((V, E, 1, 1), w) + 1$, where the additive term is due to the
fact that $k$ consecutive channels have a span of $k - 1$. The spectrum used by
a channel assignment algorithm is the interval of channels between the highest
and lowest channels assigned.
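The definitions of a channel assignment and its span translate directly into a checker. The sketch below is an illustration under assumptions: the dictionary-based encoding of weights and assignments, the function names, and the two-node instance are not from the paper.

```python
def is_valid_assignment(edges, weights, c0, c1, f):
    """Check the three conditions of a channel assignment.

    edges: iterable of pairs (u, v) of adjacent nodes,
    weights: dict node -> number of channels required,
    f: dict node -> set of non-negative integer channels.
    """
    for u, w in weights.items():
        # |f(u)| = w(u)
        if len(f[u]) != w:
            return False
        # Co-site constraint: distinct channels at one node differ by >= c0.
        chans = sorted(f[u])
        if any(hi - lo < c0 for lo, hi in zip(chans, chans[1:])):
            return False
    for u, v in edges:
        # Inter-site constraint: channels at adjacent nodes differ by >= c1.
        if any(abs(i - j) < c1 for i in f[u] for j in f[v]):
            return False
    return True

def span(f):
    """Difference between the highest and the lowest channel used."""
    used = [c for chans in f.values() for c in chans]
    return max(used) - min(used)

# Hypothetical two-node instance with c0 = 3 and c1 = 2.
edges = [("u", "v")]
weights = {"u": 2, "v": 1}
good = {"u": {0, 3}, "v": {5}}
bad = {"u": {0, 2}, "v": {5}}   # violates the co-site constraint
```

Checking only consecutive channels in sorted order suffices for the co-site condition, since the smallest gap at a node always occurs between neighbours in that order.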

A channel assignment $f$ is said to be optimal for a weighted constrained
graph $G$ if $S(f) = S(G, w) + O(1)$. Here we consider the span to be a function of
the weights and the size of the graph, so the $O(1)$ term can include terms that
depend on the constraints $c_0$ and $c_1$. An approximation algorithm for channel
assignment has performance ratio $k$ when the span of the assignment produced
by the algorithm on $(G, w)$ is at most $k\,S(G, w) + O(1)$.

The following lower bounds will be used to evaluate our algorithms and
calculate the performance ratio. The first bound derives from the fact that any
two channels on the same node must be at least $c_0$ apart. The next two bounds
are based on weights and their distribution on cliques in the graph, and are
derived from a bound for cliques given by Gamst [Gam86]. The last two bounds
use the fact that, because of the inter-site constraint, all nodes that receive
channels from any particular channel interval of length $c_1$ must form a stable
set.

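Theorem 1 (Known lower bounds).

Let $G = (V, E, c_0, c_1)$ be a constrained graph, and $w$ a positive integer weight vector
for $G$. Then

$$S(G, w) \ge c_0 w_{\max} - c_0 \tag{1}$$

$$S(G, w) \ge \max\{c_0 w(u) + (2c_1 - c_0) w(v) \mid (u, v) \in E\} - c_0 \quad \text{when } c_0 \le 2c_1 \tag{2}$$

$$S(G, w) \ge \max\{c_0 w(u) + (2c_1 - c_0)(w(v) + w(t)) \mid \{u, v, t\} \text{ a clique}\} - c_0 \quad \text{when } c_0 \le 2c_1 \tag{3}$$

$$S(G, w) \ge c_1 \max\{w(u) + w(v) + w(t) \mid \{u, v, t\} \text{ a clique}\} - c_1 \tag{4}$$

$$S(G, w) \ge \frac{2c_1}{n - 1} \sum_{v \in V} w(v) - c_1 \quad \text{when } G \text{ is an odd cycle of length } n \tag{5}$$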
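As an illustration, bounds (1) and (2) are easy to evaluate directly. The sketch below is our own and rests on two assumptions: edges are undirected, so the maximum in (2) is taken over both orientations of each edge, and the helper names `lower_bound_1` and `lower_bound_2` are hypothetical.

```python
def lower_bound_1(w, c0):
    """Bound (1): c0 * w_max - c0."""
    return c0 * max(w.values()) - c0


def lower_bound_2(edges, w, c0, c1):
    """Bound (2): max over edges of c0*w(u) + (2*c1 - c0)*w(v), minus c0."""
    if c0 > 2 * c1:
        raise ValueError("bound (2) is stated only for c0 <= 2*c1")
    best = max(
        max(c0 * w[u] + (2 * c1 - c0) * w[v],
            c0 * w[v] + (2 * c1 - c0) * w[u])  # both orientations of the edge
        for u, v in edges
    )
    return best - c0
```

On a single edge with weights 3 and 2, $c_0 = 3$ and $c_1 = 2$, bound (1) gives 6 while bound (2) gives 8, so the edge bound is the stronger one here.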
3 Basic Algorithms for Channel Assignment

In this section, we provide six simple algorithms for channel assignment in specific
situations. The first two, Algorithms A and B, are optimal algorithms for
bipartite graphs for the cases $c_0 \ge 2c_1$ and $c_1 \le c_0 \le 2c_1$ respectively. Note
that essentially the same algorithms are given and proved to be exactly optimal
(without a constant additive term) in [Ger99]. We give them here for completeness,
as we use these algorithms to prove further results in the next section. Also,
our exposition is simpler as we ignore constant additive terms in our definition
of optimality. Due to lack of space, we omit the proofs of correctness of these
algorithms; they can be found in [JN99].

Algorithms C and D both perform channel assignment for odd cycles. Neither 
of these algorithms is optimal, but we use them in the next section in combination 
with other algorithms to obtain optimal and near-optimal bounds for odd cycles. 
It is easy to check that all the algorithms given in this section have linear-time
implementations.

Algorithms E and F are for 3-colourable graphs. While Algorithm E’s perfor- 
mance is always at least as good as that of Algorithm F, the latter has some room 
for channel borrowing. Thus when used in combination with other algorithms, 
it can have an advantage over Algorithm E. We will use these two algorithms 
combined with modification techniques and with the algorithms for bipartite 
graphs, to derive near-optimal algorithms for hexagon graphs. 



Algorithm A (for bipartite graphs when $c_0 \ge 2c_1$)

Let $G = (V, E, c_0, c_1)$ be a constrained bipartite graph of $n$ nodes, where $c_0 \ge 2c_1$,
and $w$ an arbitrary weight vector. Let each node be coloured red or green
according to the bipartition. Red nodes use as many channels as necessary from
the set $0, c_0, 2c_0, \ldots, (w_{\max} - 1)c_0$. Green nodes use as many channels as necessary
from the set $c_1, c_0 + c_1, \ldots, (w_{\max} - 1)c_0 + c_1$. It is easy to see that the span of
the assignment is no more than $c_0 w_{\max} - c_1$.
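A minimal sketch of Algorithm A follows. It assumes that each node takes the lowest $w(v)$ channels of its colour's set, which is one natural reading of "as many as necessary". The function name `algorithm_a` and the dictionary-based input format are ours.

```python
def algorithm_a(colour, w, c0, c1):
    """Assign channels on a bipartite graph with c0 >= 2*c1.

    colour maps each node to "red" or "green" (the bipartition);
    w maps each node to its demand.
    """
    if c0 < 2 * c1:
        raise ValueError("Algorithm A requires c0 >= 2*c1")
    offset = {"red": 0, "green": c1}
    # Node v takes the w(v) lowest channels offset, offset + c0, offset + 2*c0, ...
    return {v: {offset[colour[v]] + j * c0 for j in range(w[v])} for v in w}
```

With $c_0 = 5$, $c_1 = 2$ and a red node of weight 3, the red node receives $\{0, 5, 10\}$ and a green neighbour of weight 2 receives $\{2, 7\}$. The span is 10, within the stated bound $c_0 w_{\max} - c_1 = 13$.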



Algorithm B (for bipartite graphs when $c_1 \le c_0 \le 2c_1$)

Let $G = (V, E, c_0, c_1)$ be a constrained bipartite graph of $n$ nodes, where
$c_1 \le c_0 \le 2c_1$, and $w$ an arbitrary weight vector. Let each node be coloured red or
green according to the bipartition.

Given a node $v$, define $p(v) = \max\{w(u) \mid (u,v) \in E\}$. The general idea
is that red nodes always get channels starting from 0 and the green nodes get
channels starting from $c_1$. If a node has demand greater than any of its neighbors
then it initially gets some channels that are $2c_1$ apart (in order to allow interspersing
the channels of its neighbors) and the remaining channels $c_0$ apart. A
more precise description can be found in [JN99]. It is not difficult to see that the
span of the assignment above is at most $\max_{(u,v) \in E} \{c_0 w(u) + (2c_1 - c_0) w(v)\}$
(see [Ger99] for a complete explanation).
Algorithm C (for odd cycles)

Let $G = (V, E, c_0, c_1)$ be a constrained cycle of $n$ nodes, where $n \ge 3$ is odd, and
$w$ be an arbitrary weight vector. Fix $c = \max\{c_0, c^*\}$ where $c^* = 2nc_1/(n - 1)$.
For convenience, the nodes of the cycle are $\{1, \ldots, n\}$, numbered in cyclic order,
where node 1 is a node of maximum weight in the cycle.

The algorithm is based on an initial basic assignment of one channel per
node. Additional channels are then given to each node by adding the appropriate
number of multiples of $c$ to the basic assigned channel of the node. The
basic assignment uses a spectrum $[0, c]$. Initially, it will proceed by assigning the
channel obtained by adding $c_1$ (modulo $c$) to the previously assigned channel to
the next node in the cycle. At a certain point, it will switch to an alternating
assignment.

More precisely, let $m > 1$ be the smallest odd integer such that $c \ge 2mc_1/(m - 1)$.
Since $c \ge c^*$, a value of $m \le n$ satisfying this must exist, and $m$ is well-defined.
Note that the definition of $m$ implies that $c < 2(m - 2)c_1/(m - 3)$ when $m > 3$. Let $b$ be the basic
assignment assigned as follows:

$$b(i) = \begin{cases} (i - 1)c_1 \bmod c & \text{when } 1 \le i \le m, \\ 0 & \text{when } i > m \text{ and } i \text{ is even}, \\ (m - 1)c_1 \bmod c & \text{when } i > m \text{ and } i \text{ is odd}. \end{cases}$$

To each node $i$, the algorithm assigns the channels $b(i) + jc$, where $j = 0, \ldots, w(i) - 1$.

Span of the assignment: The span of the channel assignment described here
is at most $w_{\max} \max\{c_0, c^*\}$, where $c^* = 2nc_1/(n - 1)$.

Algorithm D (for odd cycles)

Let $G = (V, E, c_0, c_1)$ be a constrained cycle of $n$ nodes, where $n \ge 3$ is odd, and
$w$ be an arbitrary weight vector. We state the following fact about $\chi(G,w)$.

Fact 2 For $G$ an odd cycle of $n$ nodes,

$$\chi(G, w) = \max\Big\{ 2\sum_{v \in V} w(v)/(n - 1),\ \max\{w(u) + w(v) \mid (u, v) \in E\} \Big\}.$$

This algorithm is a straightforward adaptation of the optimal algorithm for
multicolouring an odd cycle (without constraints) given in [NS97].

Fix $c = \max\{c_0, 2c_1\}$, and $\omega \ge \max\{\chi(G,w), 2w_{\max}\}$. We use the spectrum
$[0, \ldots, c\lceil \omega/2 \rceil]$, and describe a fixed sequence of channels to be used from this
spectrum in that order by the multicolouring algorithm. The sequence is:



It is straightforward to verify that there are exactly $\omega$ channels in this sequence.
We now proceed as for multicolouring, using the sequence given here to
assign channels rather than a continuous part of the spectrum.
Precisely, let $k$ be the smallest integer such that $\sum_{i=1}^{2k+1} w(i) \le k\omega$. Nodes 1
through $2k$ are assigned contiguous channels in a cyclic manner from the spectrum
given. Specifically, for $1 \le j \le 2k$, node $j$ is assigned the $\ell$-th through
$m$-th channel of the spectrum, where $\ell = 1 + \sum_{i=1}^{j-1} w(i) \bmod \omega$ and $m = \sum_{i=1}^{j} w(i) \bmod \omega$.
This assignment is done cyclically, so if $\ell > m$, then the channels ‘wrap around’
from the $\ell$-th channel to the end of the spectrum, and back from the beginning
of the spectrum to the $m$-th channel. The assignment for nodes $2k + 1$ through
$n$ is based on their parity; for $2k + 1 \le i \le n$, node $i$ is assigned the first $w(i)$
channels of the spectrum if $i$ is even, or the last $w(i)$ channels if $i$ is odd.

Span of the assignment: The span of the assignment is $\max\{c_0, 2c_1\} \lceil \omega/2 \rceil$,
where $\omega$ is any integer which is at least $\max\{2w_{\max}, \chi(G,w)\}$.



Algorithm E (for 3-colourable graphs)

Let $G = (V, E, c_0, c_1)$ be a 3-colourable graph, and $w$ be an arbitrary weight
vector. Fix $c = \max\{3c_1, c_0\}$.

This algorithm uses a colouring of the nodes of the graph with colours
red, blue and green. It assigns at most $w_{\max}$ channels to each node. For
$i = 0, \ldots, w_{\max} - 1$, the channels $ic$ are reserved for red nodes, the channels $ic + c_1$
are reserved for green nodes, and the channels $ic + 2c_1$ are reserved for blue nodes.
Each node $v$ is assigned $w(v)$ channels from its own reserved set of channels.

Span of the assignment: This algorithm produces an assignment of span at
most $c w_{\max} - c_1 = \max\{3c_1, c_0\} w_{\max} - c_1$.



Algorithm F (for 3-colourable graphs)

Let $G = (V, E, c_0, c_1)$ be a 3-colourable graph, and $w$ be an arbitrary weight
vector. Fix $c = \max\{c_1, c_0/2\}$ and $T \ge 3w_{\max}$.

We use a spectrum of $T$ channels, with consecutive channels separated by $c$,
where channels reserved for different colours are interspersed. (This alternation
of channels was first used in [SUZ97].) We assume for ease of explanation that
$T$ is a multiple of 6. Precisely, the red channels consist of a first set
$R_1 = [0, 2c, \ldots, (T/3 - 2)c]$ and a second set
$R_2 = [(T/3 + 1)c + c_0, (T/3 + 3)c + c_0, \ldots, (2T/3 - 1)c + c_0]$. The blue channels consist of first set
$B_1 = [(T/3)c + c_0, (T/3 + 2)c + c_0, \ldots, (2T/3 - 2)c + c_0]$ and second set
$B_2 = [(2T/3 + 1)c + 2c_0, (2T/3 + 3)c + 2c_0, \ldots, (T - 1)c + 2c_0]$, and the green channels consist of first
set $G_1 = [(2T/3)c + 2c_0, (2T/3 + 2)c + 2c_0, \ldots, (T - 2)c + 2c_0]$ and second set
$G_2 = [c, 3c, \ldots, (T/3 - 1)c]$. Thus, we can think of the spectrum as being divided
into three parts, each containing $T/3$ channels (with a gap of at least $c_0$ between
the parts).

Each node $v$ is assigned $w(v)$ channels from those of its colour class, where
the first set is exhausted before starting on the second set, and lowest numbered
channels are always used first within each set.

Span of the assignment: The span equals $cT + 2c_0 = \max\{c_1, c_0/2\}T + 2c_0$,
where $T$ is at least $3w_{\max}$.
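The six channel sets of Algorithm F can be generated directly from their definitions. The sketch below is our own illustration: it assumes integer $c_0$ and $c_1$, and uses exact fractions because $c = c_0/2$ need not be an integer. The function name `algorithm_f_sets` is hypothetical.

```python
from fractions import Fraction


def algorithm_f_sets(T, c0, c1):
    """Return the channels of each colour class, first set followed by second set."""
    if T % 6:
        raise ValueError("T is assumed to be a multiple of 6")
    c = max(Fraction(c1), Fraction(c0, 2))
    third = T // 3

    def part(first_k, last_k, shift):
        # Channels k*c + shift for k = first_k, first_k + 2, ..., last_k.
        return [k * c + shift for k in range(first_k, last_k + 1, 2)]

    return {
        "red": part(0, third - 2, 0) + part(third + 1, 2 * third - 1, c0),
        "blue": part(third, 2 * third - 2, c0) + part(2 * third + 1, T - 1, 2 * c0),
        "green": part(2 * third, T - 2, 2 * c0) + part(1, third - 1, 0),
    }
```

A node of a given colour would take the first $w(v)$ entries of its list. For $T = 6$, $c_0 = 3$, $c_1 = 2$ this gives red $[0, 9]$, blue $[7, 16]$ and green $[14, 2]$. Channels of one colour are at least $c_0$ apart and channels of different colours are at least $c_1$ apart.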
4 Approximation Algorithms for Odd Cycles and Hexagon Graphs

In this section, we will outline how variations and combinations of the algorithms
described in Section 3 can be used to derive optimal and near-optimal algorithms
for channel assignment in odd cycles and hexagon graphs.

Our result for odd cycles is given in the following theorem: 

Theorem 3. For any $G = (V,E,c_0,c_1)$ a constrained odd cycle of length $n$,
and $w$ an arbitrary weight vector, there is a linear time algorithm for channel
assignment in $(G,w)$ with performance ratio

$$\begin{cases} 1 & \text{when } c_0 \ge c^* = 2nc_1/(n - 1) \text{ (that is, an optimal algorithm)}, \\ 1 + 1/n & \text{when } 2c_1 < c_0 < c^* = 2nc_1/(n - 1), \\ 1 + 1/(n - 1) & \text{when } c_0 \le 2c_1. \end{cases}$$

Proof. When $c_0 \ge c^*$, Algorithm C is optimal. When $2c_1 < c_0 < c^*$, we use a
combination of Algorithms A, C, and D to obtain the result. For the remaining
case, we use a combination of Algorithms B and C. Details and proofs can be
found in [JN99].

Next, we describe approximation algorithms for channel assignment with
constraints in hexagon graphs. The algorithms we describe use a standard
3-colouring of hexagon graphs, which gives a partition of the nodes into red, blue,
and green nodes (see Figure 1). The first two theorems are based on Algorithm
E and give results for the cases $c_0 \ge 3c_1$, where the algorithm is optimal, and
for $c_1 \le c_0 < 3c_1$. The last two theorems use a combination of the algorithms
given in Section 3 with additional modifications, and deal with the cases where
$2c_1 < c_0 \le (9/4)c_1$ and $c_1 \le c_0 \le 2c_1$, respectively.

Theorem 4. For any $c_0 \ge 3c_1$, $G = (V,E,c_0,c_1)$ a constrained hexagon graph,
and $w$ an arbitrary weight vector, there is an optimal linear time approximation
algorithm for channel assignment in $(G, w)$.

Proof. Since $c_0 \ge 3c_1$, Algorithm E gives an assignment of span at most $c_0 w_{\max}$,
and it follows from lower bound (1) of Theorem 1 that this is an optimal
assignment.

Theorem 5. For any $c_1 \le c_0 < 3c_1$, $G = (V, E, c_0, c_1)$ a constrained hexagon
graph, and $w$ an arbitrary weight vector, there is a linear time approximation
algorithm for channel assignment in $(G,w)$ that has performance ratio $3c_1/c_0$.

Proof. Since $3c_1 > c_0$, Algorithm E gives an assignment of span at most $3c_1 w_{\max}$.
By lower bound (1), $S(G,w) \ge c_0 w_{\max} - c_0$, so this span is at most
$(3c_1/c_0)S(G, w) + O(1)$, as claimed.

Note that when $c_0 \ge (9/4)c_1$, the performance ratio of the above algorithm
is at most 4/3. In the following two theorems, we give algorithms that improve
on the performance ratio given in Theorem 5 for values of $c_0 < (9/4)c_1$.
The two remaining algorithms in this section use the same type of strategy.
First, each node is assigned enough channels from those assigned to its colour
class to guarantee that there are no triangles left in the graph. Next, each node
borrows any available channels of a designated borrowing colour. The resulting
graph is then shown to be a bipartite graph, for which an optimal channel
assignment is found. (This general approach was first used for multicolouring
of hexagon graphs in [NS97] and [MR97].) The algorithms differ in the initial
separation of channels into different colour classes. The algorithm of Theorem 7
uses an additional technique of squeezing channels when possible. Both borrowing
and squeezing make use of already assigned parts of the spectrum, and thus do
not add to the total span of the assignment. In the following, let $D$ represent
the maximum of the sum of weights on any clique in the graph.

Theorem 6. For any $2c_1 < c_0 \le (9/4)c_1$, $G = (V, E, c_0, c_1)$ a constrained
hexagon graph, and $w$ an arbitrary weight vector, there is a linear time
approximation algorithm for channel assignment in $(G, w)$ that has performance ratio

$$1 + \frac{3(c_0 - 2c_1)}{c_0} + \frac{9c_1 - 4c_0}{3c_1}.$$

Proof. The algorithm proceeds in four phases. The first two phases assign a total 
of D channels, partly according to Algorithm E (Phase 1) and partly according 
to Algorithm F (Phase 2). The next phase is a borrowing phase in which nodes 
borrow any available channels of the borrowing colour, i.e. any channels that 
are unused by all of its neighbors of the borrowing colour. The resulting graph 
is shown to be a bipartite graph, for which an assignment is found in Phase 4 
using Algorithm A. The details of the phases and the proof of correctness and
span are omitted here due to lack of space, and can be found in [JN99].

The above theorem yields an algorithm with performance ratio that is always
less than $4/3 + 1/100$. In particular, the maximum value of the performance ratio
is reached when $c_0/c_1 = 3/\sqrt{2}$. When $c_0 = 2c_1$ or $c_0 = 9c_1/4$, the performance
ratio is exactly 4/3.
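These claims can be checked numerically. The sketch below assumes that the performance ratio of Theorem 6 reads $1 + 3(c_0 - 2c_1)/c_0 + (9c_1 - 4c_0)/(3c_1)$; this reading is consistent with the endpoint values and the location of the maximum stated in the text. The function name `theorem6_ratio` is ours.

```python
import math


def theorem6_ratio(c0, c1):
    """Performance ratio of Theorem 6, under the reading assumed above."""
    return 1 + 3 * (c0 - 2 * c1) / c0 + (9 * c1 - 4 * c0) / (3 * c1)


# Evaluate at the two endpoints and at the claimed maximiser c0/c1 = 3/sqrt(2).
endpoints = (theorem6_ratio(2, 1), theorem6_ratio(9, 4))
peak = theorem6_ratio(3 / math.sqrt(2), 1)
```

With $x = c_0/c_1$ the ratio simplifies to $7 - 6/x - 4x/3$, whose derivative vanishes at $x^2 = 9/2$. The peak value is $7 - 4\sqrt{2} \approx 1.3431$, just below $4/3 + 1/100$.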

Theorem 7. For any $c_0 \le 2c_1$, $G = (V, E, c_0, c_1)$ a constrained hexagon graph,
and $w$ an arbitrary weight vector, there is a linear time approximation algorithm
for channel assignment in $(G,w)$ that has performance ratio 4/3.

Proof. The algorithm consists of four phases. First, Algorithm F is used with
a spectrum of $D$ channels. In Phase 2, nodes try to borrow available channels
of the borrowing colour. In the next phase, some of the channels assigned are
“squeezed” closer together where possible to accommodate more channels. The
remaining graph is then bipartite, and Algorithm B is used. The details of the
phases and the proof of correctness and span are omitted here due to lack of
space, and can be found in [JN99].

Table 1 below summarizes the results of this section. Note that the given
values are upper bounds; as $c_0$ approaches $3c_1$, the performance ratio approaches
1, and similarly, the performance ratio approaches 4/3 at both ends of the range
$2c_1 < c_0 \le 9c_1/4$.
              $c_1 \le c_0 \le 2c_1$   $2c_1 < c_0 \le 9c_1/4$   $9c_1/4 < c_0 < 3c_1$   $c_0 \ge 3c_1$
Perf. Ratio   $4/3$                    $4/3 + 1/100$             $4/3$                   $1$

Table 1. Upper bounds on performance ratio on hexagon graphs for different values
of $c_0$ and $c_1$.



5 Conclusions 

We described new algorithms for channel assignment with arbitrary co-site and 
inter-site constraints on odd cycles and hexagon graphs. For odd cycles, our 
algorithms are optimal or near-optimal. We conjecture that in the cases where
we do not achieve optimality, the existing lower bounds for odd cycles are
inadequate. For hexagon graphs, for the case $c_0 < 3c_1$, we give approximation
algorithms with performance ratios of at most $4/3 + \epsilon$ (when $c_0 \ge 3c_1$, there is a
straightforward optimal algorithm). We point out that this matches the performance
ratio of the best known algorithm for multicolouring on hexagon graphs,
for most values in this range, and is only greater by a very small factor for all
values (see Table 1).

Our algorithms are centralized and static algorithms. However, in practice, 
channel allocation is a distributed and online task; future work will involve the 
investigation of efficient online and distributed algorithms for this problem. 
Abstract. Let $G = (V,E)$ be a plane graph with nonnegative edge
lengths, and let $\mathcal{N}$ be a family of $k$ vertex sets $N_1, N_2, \ldots, N_k \subseteq V$,
called nets. Then a noncrossing Steiner forest for $\mathcal{N}$ in $G$ is a set $\mathcal{T}$ of $k$
trees $T_1, T_2, \ldots, T_k$ in $G$ such that each tree $T_i \in \mathcal{T}$ connects all vertices
in $N_i$, any two trees in $\mathcal{T}$ do not cross each other, and the sum of edge
lengths of all trees is minimum. In this paper we give an algorithm to
find a noncrossing Steiner forest in a plane graph $G$ for the case where
all vertices in nets lie on two of the face boundaries of $G$. The algorithm
takes time $O(n \log n)$ if $G$ has $n$ vertices.



1 Introduction 

The Steiner tree problem is to find a minimum tree connecting a “net,” i.e. a 
set of “pins,” and often appears in the routing of VLSI layouts. One may often 
wish to find Steiner trees for many nets instead of a single net. Any two of the 
trees must not cross each other in the single layer VLSI layout due to a physical 
condition. Algorithms to solve such problems are desired for automatic routing 
of VLSI layouts.

One such problem is finding, in a given plane graph with $k$ terminal-pairs,
a set of $k$ paths such that no two of them cross each other
and the sum of their weights is minimum. This problem is a natural model for
the single layer VLSI routing of “two-terminal nets,” each of which contains
exactly two pins. One may regard edges of a graph as routing regions, paths as
routes, and each terminal pair of a path as a pair of pins connected by a route.
One may wish to find a set of noncrossing paths the total weight of which is
minimum. If all terminals are on at most two of the face boundaries of a plane
graph, then such a set of $k$ noncrossing paths can be found in time $O(n \log n)$
for any $k \ge 1$, where $n$ is the number of vertices in a plane graph.

Another well-known problem is finding a Steiner tree on a plane graph.
Although finding a Steiner tree on a plane graph is very useful for
VLSI routing, the problem is NP-complete [1] and it seems that there exists
no efficient algorithm to solve the problem. However, if all terminals lie on a
single face boundary of a plane graph, a Steiner tree can be found in time
$O(l^3 n + l^2 n \log n)$ where $l$ is the number of terminals [1,3].



A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 337- 134^ 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 
In this paper, we deal with a problem extended from the two problems above
and defined as follows. For a plane graph $G = (V,E)$ with nonnegative edge
lengths and a family $\mathcal{N}$ of $k$ vertex sets $N_1, N_2, \ldots, N_k$, called nets, find a set $\mathcal{T}$
of $k$ trees $T_1, T_2, \ldots, T_k$, called a noncrossing Steiner forest, such that

— for each $N_i \in \mathcal{N}$, tree $T_i \in \mathcal{T}$ connects all terminals in $N_i$;

— any two trees in $\mathcal{T}$ do not cross each other (a formal definition of crossing
trees will be given in Section 2); and

— the total weight $\sum_{T_i \in \mathcal{T}} \sum_{e \in T_i} w(e)$ of the forest $\mathcal{T}$ is minimum, where $w$ is a
weight function from $E$ to nonnegative real numbers.

An example of a noncrossing Steiner forest is depicted in Fig. 1, where $k = 7$,
all terminals in net $N_i$, $1 \le i \le k$, are labeled by $i$, and $w(e) = 1$ for every edge $e$.




Fig. 1. A noncrossing Steiner forest. 



If every net contains exactly two terminals, then a noncrossing Steiner forest
is merely a set of noncrossing paths such that the sum of their lengths is
minimum. On the other hand, if $k = 1$, that is, there is exactly one net, then a
noncrossing Steiner forest is merely an ordinary Steiner tree. Since the Steiner
tree problem is NP-complete for plane graphs [4], the noncrossing Steiner forest
problem is NP-complete in general for plane graphs. However, some restrictions
for the location of terminals may contribute to the construction of an efficient
algorithm. In this paper we show that it is indeed the case. That is, we give an
efficient algorithm for finding a noncrossing Steiner forest in a plane graph if all
terminals are on at most two of the face boundaries as in Fig. 1. Our algorithm
takes $O(n \log n)$ time if there are $n$ vertices in a graph.

This paper is organized as follows. In Section 2 we give some preliminary
definitions. In Section 3 we give an algorithm for finding a noncrossing Steiner
forest in a plane graph for the case where all terminals are on a single face
boundary. In Section 4 we give an algorithm for finding a noncrossing Steiner
forest for the case where all terminals are on at most two of the face boundaries.
Finally, we conclude in Section 5.



Algorithms for Finding Noncrossing Steiner Forests in Plane Graphs 339 



2 Preliminaries 

In this section we define terms and formally describe a noncrossing Steiner forest 
problem. 

We denote by $G = (V,E)$ a graph consisting of a vertex set $V$ and an edge
set $E$. We denote by $V(G)$ and $E(G)$ the vertex set and the edge set of $G$,
respectively. Let $n$ be the number of vertices in $G$. Assume that $G$ is an undirected
plane connected graph and that every edge $e \in E$ has a nonnegative weight $w(e)$.
Furthermore we assume that $G$ is embedded in the plane $\mathbb{R}^2$. The image of $G$
in $\mathbb{R}^2$ is denoted by $Img(G) \subseteq \mathbb{R}^2$. A face of $G$ is a connected component of
$\mathbb{R}^2 - Img(G)$. The boundary $B(f)$ of a face $f$ is the maximal subgraph of $G$ whose
image is included in the closure of the face $f$. The unbounded face of $G$ is called
the outer face, and the boundary of the outer face is called the outer boundary.
For two graphs $G = (V, E)$ and $H = (W, F)$, we define $G \cup H = (V \cup W, E \cup F)$.

A set of vertices in $G$ which we wish to connect by a tree is called a net. All
vertices in a net are called terminals. Let $\mathcal{N} = \{N_1, N_2, \ldots, N_k\}$ be a set of $k$
nets. For the sake of simplicity, we assume that $N_i \cap N_j = \emptyset$ for any two different
nets $N_i, N_j \in \mathcal{N}$. Let $l_i = |N_i|$ and we write $N_i = \{u_{i1}, u_{i2}, \ldots, u_{il_i}\}$ for each $i$,
$1 \le i \le k$. We assume that all $l_i$, $1 \le i \le k$, are fixed constants, and hence there
exists a fixed integer $L$ such that $2 \le l_i \le L$ for all $i$, $1 \le i \le k$.

We assume that all terminals are on the boundaries of two faces, say $f_1$ and
$f_2$. One may assume that $f_1$ is the outer face and $f_2$ is an inner face. For the
sake of simplicity, we assume that $G$ is 2-connected and that $B(f_1)$ and $B(f_2)$
have no common vertices or edges.

For each net $N_i \in \mathcal{N}$, a tree in $G$ connecting all terminals in $N_i$ is called a
tree $T_i$ for net $N_i$ (in $G$). A weight $w(T_i)$ of $T_i$ is $\sum_{e \in E(T_i)} w(e)$. A Steiner tree
for $N_i$ (in $G$) is a tree $T_i$ for $N_i$ in $G$ with the minimum weight $w(T_i)$.

The so-called noncrossing paths may share common vertices or edges but do
not cross each other in the plane. For example, paths $P_1$ and $P_2$ depicted in Fig.
2(a) cross each other on the plane, while paths $P_1$ and $P_2$ in Fig. 2(b) do not
cross each other.

Fig. 2. (a) Crossing paths and (b) noncrossing paths.



We then define “crossing trees.” Let $G^+$ be a plane graph obtained from $G$
as follows: add a new vertex $v_o$ in the outer face $f_1$, and join $v_o$ and every
terminal on $B(f_1)$; similarly, add a new vertex $v_i$ in the inner face $f_2$, and join
$v_i$ and every terminal on $B(f_2)$. Add to tree $T_i$ the following edges, those joining
$v_o$ and terminals of $N_i$ on $B(f_1)$ and those joining $v_i$ and terminals of $N_i$ on
$B(f_2)$, and let $T_i^+$ be the resulting subgraph of $G^+$. We say that trees $T_i$ and
$T_j$ do not cross each other if any pair of a path in $T_i^+$ and a path in $T_j^+$ do
not cross each other. In Fig. 3, trees $T_1$ and $T_4$ do not cross each other. On the
other hand, trees $T_2$ and $T_3$ cross each other since the path connecting $v_o$ and
$u_{22}$ through $u_{21}$ in $T_2^+$ and the path connecting $u_{31}$ and $u_{33}$ through $u_{21}$ in $T_3^+$
cross each other. The definition above is appropriate for the VLSI single-layer
routing problem.




Fig. 3. A crossing forest. 



A forest $\mathcal{T}$ for $\mathcal{N}$ in $G$ is defined to be a set $\{T_1, T_2, \ldots, T_k\}$ such that, for
each $i$, $1 \le i \le k$, $T_i$ is a tree for net $N_i$ in $G$. We say that a forest $\mathcal{T}$ is
noncrossing if every pair of trees in $\mathcal{T}$ do not cross each other. The forest in
Fig. 1 is noncrossing, but the forest in Fig. 3 is crossing. The weight $w(\mathcal{T})$ of a
forest $\mathcal{T}$ is $\sum_{T_i \in \mathcal{T}} w(T_i)$. A noncrossing Steiner forest for $\mathcal{N}$ in $G$ is a noncrossing
forest for $\mathcal{N}$ in $G$ with the minimum total weight $w(\mathcal{T})$. Our goal is to find a
noncrossing Steiner forest for $\mathcal{N}$ in a plane graph for the following two restricted
cases: a case where all terminals lie on a single face boundary, and the other case
where all terminals lie on two of the face boundaries.



3 One Face Problem 

In this section, we present an efficient algorithm to solve the problem for the 
case where all terminals lie on a single face boundary B{fi) of a plane graph G. 
We call such a restricted problem the one face problem. 

One can observe that the following lemma holds. 

Lemma 1 There exists a noncrossing forest for $\mathcal{N}$ if and only if there are no
four distinct terminals $u_{ii'}, u_{jj'}, u_{ii''}$ and $u_{jj''}$ appearing clockwise on $B(f_1)$ in
this order, where $u_{ii'}, u_{ii''} \in N_i \in \mathcal{N}$ and $u_{jj'}, u_{jj''} \in N_j \in \mathcal{N}$. □

Using Lemma 1 one can easily determine in linear time whether there is a
noncrossing forest for $\mathcal{N}$. Thus, from now on, we may assume that there is a
noncrossing forest for $\mathcal{N}$.
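One way to carry out this linear-time test is a single stack-based scan of the boundary. The sketch below is our own illustration, not taken from the paper: it assumes the input is the list of net labels of the terminals in clockwise order around $B(f_1)$, and the function name `admits_noncrossing_forest` is hypothetical.

```python
from collections import Counter


def admits_noncrossing_forest(labels):
    """Return True if no two nets interleave (pattern i, j, i, j) on the boundary."""
    remaining = Counter(labels)      # terminals of each net not yet scanned
    stack, on_stack = [], set()
    for net in labels:
        remaining[net] -= 1
        if net not in on_stack:
            stack.append(net)
            on_stack.add(net)
            continue
        # The net was seen before: every net opened since then must be finished.
        while stack[-1] != net:
            top = stack[-1]
            if remaining[top] > 0:
                return False         # top still has a later terminal: i, j, i, j
            stack.pop()
            on_stack.discard(top)
    return True
```

Each net is pushed and popped at most once, so the scan takes time linear in the number of terminals. For example, the clockwise order 1, 2, 2, 1 is accepted, while 1, 2, 1, 2 is rejected.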

Our algorithm first finds $k$ cycles $C_1, C_2, \ldots, C_k$ in $G$, and then finds $k$ Steiner
trees, one for each net $N_i \in \mathcal{N}$, in the inside of $C_i$. A cycle $C_i$, $1 \le i \le k$, for
$N_i$ is defined as follows. Let $S_0 = v_1, v_2, \ldots$ be the clockwise sequence of
all vertices on $B(f_1)$ starting from $u_{11}$, where $v_1 = u_{11}$. One may assume that,
for each net $N_i \in \mathcal{N}$, the terminals $u_{i1}, u_{i2}, \ldots, u_{il_i}$ in $N_i$ appear in $S_0$ in this
order, and that the first terminals $u_{11}, u_{21}, \ldots, u_{k1}$ in all nets appear in $S_0$ in
this order, as illustrated in Fig. 4 for the case $k = 4$. For each $i$, $1 \le i \le k$, there
exists a set $\mathcal{P}_i$ of $l_i$ paths $P_1, P_2, \ldots, P_{l_i}$ such that each $P_{i'}$, $1 \le i' \le l_i$, is a
shortest path connecting $u_{ii'}$ and $u_{i(i'+1)}$ in $G$, and any two paths in $\mathcal{P}_i$ do not
cross each other, where $u_{i(l_i+1)} = u_{i1}$. The cycle $C_i$ obtained by concatenating all
the $l_i$ paths in $\mathcal{P}_i$ is called a cycle $C_i$ for $N_i$. We denote by $G(C_i)$ the subgraph
of $G$ inside $Img(C_i)$. We say that cycles $C_i$ and $C_j$ do not cross each other if
any pair of a path in $\mathcal{P}_i$ and a path in $\mathcal{P}_j$ do not cross each other. A set of
noncrossing cycles for $\mathcal{N}$ is defined to be a set $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ such that
each $C_i$ is a cycle for $N_i$ and any two cycles in $\mathcal{C}$ do not cross each other. Fig.
5(a) illustrates a set of noncrossing cycles. Then the following lemma holds.

Fig. 4. Illustration of a noncrossing forest.



Lemma 2 There exists a set $\mathcal{C}$ of noncrossing cycles for $\mathcal{N}$ if there exists a
noncrossing forest $\mathcal{T}$ for $\mathcal{N}$. □

Finding a set $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ of noncrossing cycles would require time
$\Omega(n^2)$, because vertices in $G$ may be shared by many distinct cycles and hence
the sum $\sum_{i=1}^{k} |V(C_i)|$ is not always bounded by $O(n)$. Remember that $k$ is not
always a fixed number but $k = O(n)$. Therefore, we find a subgraph
$G_{\mathcal{C}} = \bigcup_{C_i \in \mathcal{C}} C_i$ of $G$ instead of a set $\mathcal{C}$ of noncrossing cycles. Clearly, $G_{\mathcal{C}}$ has at most $n$ vertices.
Similarly, we find a subgraph $G_{\mathcal{T}} = \bigcup_{T_i \in \mathcal{T}} T_i$ of $G$ instead of a noncrossing Steiner
forest $\mathcal{T}$ for $\mathcal{N}$. From $G_{\mathcal{T}}$ one can find $T_i$ in time $O(|V(T_i)|)$ for each $i$, $1 \le i \le k$.

We are now ready to present an algorithm Forest1 to solve the one face
problem.

Algorithm Forest1

step 1. Find $G_{\mathcal{C}}$, where $\mathcal{C}$ is a set of noncrossing cycles for $\mathcal{N}$;

step 2. For each cycle $C_i \in \mathcal{C}$, find a Steiner tree $T_i$ for $N_i$ in the plane subgraph
$G(C_i)$;

step 3. Construct a graph $G_{\mathcal{T}} = \bigcup_{i=1}^{k} T_i$.

We first consider how to execute step 1. For each $i$, $1 \le i \le k$, let $\mathcal{N}_i'$ be a set of
$l_i$ two-terminal nets defined as follows: $\mathcal{N}_i' = \{\{u_{i1}, u_{i2}\}, \{u_{i2}, u_{i3}\}, \ldots, \{u_{il_i}, u_{i1}\}\}$.
Let $\mathcal{N}' = \bigcup_{i=1}^{k} \mathcal{N}_i'$. Then $\mathcal{N}'$ consists of two-terminal nets. In Fig. 5(b),
each two-terminal net in $\mathcal{N}'$ is drawn by a pair of a white point and a black
point. Since all terminals in $\mathcal{N}'$ are on a single face boundary $B(f_1)$, using the
algorithm in [5], one can find the minimum noncrossing paths for $\mathcal{N}'$ in time
$O(n \log n)$. Furthermore, it is shown in [5] that each of the paths is a shortest
path connecting the pair of terminals in a net. Therefore, for each $i$, $1 \le i \le k$,
one can obtain a cycle $C_i$ by concatenating the $l_i$ paths for the nets in $\mathcal{N}_i'$.
Thus, the minimum noncrossing paths for $\mathcal{N}'$ immediately yield the subgraph
$G_{\mathcal{C}} = \bigcup_{i=1}^{k} C_i$.

Fig. 5. Illustration for step 1.
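The construction of the two-terminal nets in step 1 amounts to pairing each terminal of a net with its cyclic successor. A minimal sketch follows, assuming a net is given as a list of its terminals in clockwise order; the function name `two_terminal_nets` is ours.

```python
def two_terminal_nets(net):
    """Pair each terminal with its cyclic successor, giving l_i pairs for l_i terminals."""
    size = len(net)
    return [(net[j], net[(j + 1) % size]) for j in range(size)]
```

For a net with terminals `["u11", "u12", "u13"]` this yields the three pairs `("u11", "u12")`, `("u12", "u13")` and `("u13", "u11")`, whose connecting paths close up into the cycle for that net.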

We next consider how to execute steps 2 and 3. Let $m_i$ be the number of
edges in $G(C_i)$. Using the algorithm in [1,3], for each net $N_i \in \mathcal{N}$, one can find
a Steiner tree $T_i$ for $N_i$ in time $O(m_i \log m_i)$, since $|N_i| = O(1)$ and all terminals
in $N_i$ lie on the outer boundary of $G(C_i)$. Therefore, one can find a noncrossing
Steiner forest $\mathcal{T}$ for $\mathcal{N}$ in time $O(\sum_{i=1}^{k} m_i \log m_i)$. Thus one can find $G_{\mathcal{T}}$ in time
$O(n \log n)$, although the details are omitted in this extended abstract.

The correctness of our algorithm is based on the following lemma, the proof 
of which is omitted in this extended abstract. 
Lemma 3 For any set $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ of noncrossing cycles, a Steiner
tree $T_i$ for $N_i$ in $G(C_i)$ is a Steiner tree $T_i$ for $N_i$ in $G$ for each $i$, $1 \le i \le k$. □

Thus we have the following Theorem 1.

Theorem 1 If all terminals lie on a single face boundary in a plane graph, then 
one can find a noncrossing Steiner forest T in time O(nlogn). □ 

4 Two Faces Problem 

In this section, we give an algorithm to find a noncrossing Steiner forest for $\mathcal{N}$
for the case where all terminals lie on two of the face boundaries, $B(f_1)$ and
$B(f_2)$. We call such a restricted problem the two faces problem.

We divide $\mathcal{N}$ into the following three subsets $\mathcal{N}_{out}$, $\mathcal{N}_{in}$ and $\mathcal{N}_{io}$:

1. for each net $N_i \in \mathcal{N}_{out}$, all terminals of $N_i$ lie on the outer face boundary
$B(f_1)$;

2. for each net $N_i \in \mathcal{N}_{in}$, all terminals of $N_i$ lie on the inner face boundary
$B(f_2)$; and

3. for each net $N_i \in \mathcal{N}_{io}$, $N_i$ contains a terminal on $B(f_1)$ and a terminal on
$B(f_2)$.

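Obviously, these three sets are disjoint and $\mathcal{N} = \mathcal{N}_{out} \cup \mathcal{N}_{in} \cup \mathcal{N}_{io}$. One may
assume $\mathcal{N}_{io} \ne \emptyset$; otherwise, one can easily find a noncrossing Steiner forest
for $\mathcal{N}$: find a noncrossing Steiner forest $\mathcal{T}_{out}$ for $\mathcal{N}_{out}$ by executing Forest1
for $\mathcal{N}_{out}$, all terminals of which are on $B(f_1)$; then find a noncrossing Steiner
forest $\mathcal{T}_{in}$ for $\mathcal{N}_{in}$ by executing Forest1 for $\mathcal{N}_{in}$, all terminals of which are
on $B(f_2)$; and finally let $\mathcal{T} = \mathcal{T}_{out} \cup \mathcal{T}_{in}$. For the sake of simplicity, we assume
$|\mathcal{N}_{io}| \ge 2$. Let $\mathcal{N}_{io} = \{N_1, N_2, \ldots, N_m\}$, where $m = |\mathcal{N}_{io}|$. For each net $N_i \in \mathcal{N}_{io}$,
let $N_i^{o} = N_i \cap V(B(f_1))$, and $N_i^{i} = N_i \cap V(B(f_2))$. One may assume that
$N_i^{o} = \{u_{i1}, u_{i2}, \ldots, u_{il_i'}\}$ and $N_i^{i} = \{u_{i(l_i'+1)}, u_{i(l_i'+2)}, \ldots, u_{il_i}\}$, where
$l_i = l_i' + l_i''$. One may assume that terminals
$u_{11}, u_{12}, \ldots, u_{1l_1'}, u_{21}, u_{22}, \ldots, u_{2l_2'}, \ldots, u_{m1}, u_{m2}, \ldots, u_{ml_m'}$
appear clockwise on $B(f_1)$ in this order, and that
$u_{1(l_1'+1)}, u_{1(l_1'+2)}, \ldots, u_{1l_1}, u_{2(l_2'+1)}, u_{2(l_2'+2)}, \ldots, u_{2l_2}, \ldots, u_{m(l_m'+1)}, u_{m(l_m'+2)}, \ldots, u_{ml_m}$
appear clockwise on $B(f_2)$ in this order, as illustrated in Fig. 6. Otherwise, there
is no noncrossing forest. Let $s = u_{11}$, and let $t$ be a terminal of $N_1$ on $B(f_2)$.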

The main idea of our algorithm is to reduce the two faces problem for $G$
to the one face problem for three graphs. For a path $P$ between $s$ and $t$, a slit
graph $SG(P)$ of $G$ for $P$ is generated from $G$ by slitting apart path $P$ into
two paths $P_l$ and $P_r$, duplicating the vertices and edges of $P$ as follows (see
Fig. 7). Each vertex $v$ in $P$ is replaced by new vertices $v_l$ and $v_r$. Each edge
$(v, v')$ in $P$ is replaced by two parallel edges $(v_l, v'_l)$ in $P_l$ and $(v_r, v'_r)$ in $P_r$.
Any edge $(v, w)$ that is not in $P$ but is incident to a vertex $v$ in $P$ is replaced
by $(v_l, w)$ if $(v, w)$ is to the left of the path $P$ going from $s$ to $t$, and by $(v_r, w)$
if $(v, w)$ is to the right of the path. The operation above is called slitting $G$
along $P$. A graph and its slit graph are illustrated in Fig. 7. The net $N_1 \in \mathcal{N}_{io}$
Fig. 6. A noncrossing forest.





Fig. 7. Illustration for a graph G and its slit graph SG{P). 



contains $s = u_{11}$ and $t$. We replace net $N_1 \in \mathcal{N}$ for $G$ by a new net $N'_1$
for $SG(P)$, obtained from $N_1 - \{s, t\}$ by adding copies of $s$ and $t$ created by the slitting.

Let $P^*$ be any shortest path connecting $s$ and $t$ in $G$. We denote by $P^+$ a
path in $G$ corresponding to a shortest path connecting $s_l$ and $t_r$ in $SG(P^*)$. (In
Fig. 8(a) the path is drawn by a thick line.) Similarly, we denote by $P^-$ a path
in $G$ corresponding to a shortest path connecting $s_r$ and $t_l$ in $SG(P^*)$. (In
Fig. 8(a) the path is drawn by a dotted line.)

We are now ready to present an algorithm Forest2 to solve the two faces 
problem. 

Algorithm Forest2

step 1. Find a shortest path $P^*$ between $s$ and $t$ in $G$;

step 2. Construct a slit graph $SG(P^*)$ of $G$, and find $P^+$ and $P^-$; {Fig. 8(a)}
step 3. Construct slit graphs $SG(P^+)$ and $SG(P^-)$ of $G$; {Figs. 8(b) and (c)}
step 4. Using Forest1, find noncrossing Steiner forests $T^*$ in $SG(P^*)$, $T^+$ in
$SG(P^+)$, and $T^-$ in $SG(P^-)$;

step 5. Output one of these three forests having the minimum total weight.

One can find $P^*$ either in time $O(n \log n)$ by an ordinary Dijkstra algorithm
or in time $O(n)$ by a sophisticated shortest path algorithm in [5,10]. Thus one
can execute steps 1-3 in time $O(n \log n)$.
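
Step 1 can be illustrated with an ordinary Dijkstra search. The following is a minimal sketch, not the authors' implementation: the adjacency-list format and the function name `shortest_path` are assumptions made for illustration. With a binary heap the search runs in $O(m \log n)$ time, which is $O(n \log n)$ when the number $m$ of edges is $O(n)$, as it is for a simple plane graph.

```python
import heapq

def shortest_path(adj, s, t):
    """Dijkstra's algorithm from s to t.

    adj maps each vertex to a list of (neighbour, weight) pairs with
    non-negative weights (hypothetical input format). Vertices must be
    mutually comparable (e.g. all ints or all strings) so that heap ties
    can be broken. Returns (path, length), or None if t is unreachable.
    """
    dist = {s: 0}
    parent = {s: None}
    heap = [(0, s)]
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue              # stale heap entry
        done.add(v)
        if v == t:
            break
        for w, weight in adj[v]:
            nd = d + weight
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                parent[w] = v
                heapq.heappush(heap, (nd, w))
    if t not in done:
        return None
    # Walk the parent pointers back from t to s.
    path = []
    v = t
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1], dist[t]
```

The same routine could be reused in step 2 on the slit graph $SG(P^*)$ to obtain the paths corresponding to $P^+$ and $P^-$, provided the slit graph is available in the same adjacency-list form.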




Fig. 8. Illustration for SG{P*), SG{P+), SG{p-). 



Since all terminals in $SG(P^*)$ lie on the outer boundary of $SG(P^*)$, one
can find $T^*$ in $SG(P^*)$ in time $O(n \log n)$ using Forest1. Similarly, one can
find $T^+$ and $T^-$ in time $O(n \log n)$. Thus, one can execute step 4 in time
$O(n \log n)$.

Obviously, one can execute step 5 in a constant time. 

The correctness of our algorithm is based on the following lemma, the proof 
of which is omitted in this extended abstract. 

Lemma 4 There exists a noncrossing Steiner forest $T$ such that either $P^*$, $P^+$
or $P^-$ crosses none of the trees in $T$. □

Thus we have the following Theorem 2.

Theorem 2 If all terminals lie on at most two face boundaries in a plane
graph, then one can find a noncrossing Steiner forest in time $O(n \log n)$. □



5 Conclusion 

In this paper, we present an efficient algorithm for finding a noncrossing Steiner
forest in a plane graph for the case where all terminals are located on at most two
of the face boundaries. Our algorithm takes $O(n \log n)$ time and $O(n)$ space if
there exists a constant upper bound $L$ for the number of terminals in each net.
It takes time 0{L^n + log n) if the maximum number L of terminals in all 
nets is not always a fixed constant. 

Slightly modifying our algorithm, one can find an “optimal” noncrossing forest.
Let $T = \{T_1, T_2, \ldots, T_k\}$ be a noncrossing forest for $\mathcal{N}$. Denote the weight
$w(T_i)$ of $T_i$ simply by $w_i$. Let $f(w_1, w_2, \ldots, w_k)$ be an arbitrary (objective)
function which is nondecreasing with respect to each variable $w_i$. We call a
noncrossing forest $T = \{T_1, T_2, \ldots, T_k\}$ minimizing $f(w_1, w_2, \ldots, w_k)$ an optimal
noncrossing forest (with respect to the objective function $f$).

EXAMPLE 1. The noncrossing Steiner forest is an optimal noncrossing forest
minimizing the objective function $f = \sum_{i=1}^{k} w_i$. Clearly, $f$ is nondecreasing
with respect to each $w_i$.

EXAMPLE 2. If the wires for all nets have the same width, then a noncrossing
Steiner forest corresponds to a routing minimizing the area required by wires.
On the other hand, if the wires have various widths, say width $\alpha_i$ for net $N_i$,
then the optimal noncrossing forest minimizing $f = \sum_{i=1}^{k} \alpha_i w_i$ corresponds to
a routing minimizing the area. This function $f$ is also nondecreasing with
respect to $w_i$ if all $\alpha_i$ are nonnegative.
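
The two objective functions of Examples 1 and 2 can be written down directly. The sketch below is illustrative only: the function names, the representation of a forest as a list of tree weights, and the helper `best_forest` are assumptions, and the source does not spell out how the algorithm itself is modified.

```python
def total_weight(weights):
    """Objective of Example 1: f = w_1 + w_2 + ... + w_k."""
    return sum(weights)

def weighted_area(weights, widths):
    """Objective of Example 2: f = sum of alpha_i * w_i.

    The function is nondecreasing in every w_i only when all widths
    alpha_i are nonnegative, so negative widths are rejected.
    """
    if any(a < 0 for a in widths):
        raise ValueError("widths must be nonnegative")
    return sum(a * w for a, w in zip(widths, weights))

def best_forest(candidates, objective):
    """Pick the candidate (a list of tree weights) with the smallest objective.

    Hypothetical helper: it only compares already-computed candidates.
    """
    return min(candidates, key=objective)
```

For instance, two candidate forests with tree weights `[3, 5, 2]` and `[4, 4, 1]` have total weights 10 and 9, so the second is preferred under the objective of Example 1.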

One direction for future work is to obtain an algorithm that solves the noncrossing
Steiner forest problem in the general case where terminals lie on three or more face
boundaries.
Abstract. A total coloring of a graph $G$ is a coloring of all elements of
$G$, i.e. vertices and edges, in such a way that no two adjacent or incident
elements receive the same color. The total coloring problem is to find
a total coloring of a given graph with the minimum number of colors.
Many combinatorial problems can be efficiently solved for partial $k$-trees,
i.e., graphs with bounded tree-width. However, no efficient algorithm has
been known for the total coloring problem on partial $k$-trees although a
polynomial-time algorithm of very high order has been known. In this
paper, we give a linear-time algorithm for the total coloring problem on
partial $k$-trees with bounded $k$.



1 Introduction 

A total coloring is a mixture of ordinary vertex-coloring and edge-coloring. That
is, a total coloring of a graph $G$ is an assignment of colors to its vertices and
edges so that no two adjacent vertices have the same color, no two adjacent
edges have the same color, and no edge has the same color as one of its ends. The
minimum number of colors required for a total coloring of a graph $G$ is called the
total chromatic number of $G$, and denoted by $\chi_t(G)$. Figure 1 illustrates a total
coloring of a graph $G$ using $\chi_t(G) = 4$ colors. This paper deals with the total
coloring problem which asks to find a total coloring of a given graph $G$ using
the minimum number $\chi_t(G)$ of colors. Since the problem is NP-complete for
general graphs [San89], it is very unlikely that there exists an efficient algorithm
for solving the problem for general graphs. On the other hand, many
combinatorial problems including the vertex-coloring problem and the edge-coloring
problem can be solved for partial $k$-trees with bounded $k$ very efficiently, mostly
in linear time [ACPS93, AL91, BPT92, Cou90, CM93, ZSN96]. However, no efficient
algorithm has been known for the total coloring problem on partial $k$-trees.
Although the total coloring problem can be solved in polynomial time for partial
$k$-trees by a dynamic programming algorithm, the time complexity is a polynomial
of very high order [IZN99].



Fig. 1. A total coloring of a graph with four colors. 



In this paper, we give a linear-time algorithm to solve the total coloring
problem for partial $k$-trees with bounded $k$. The outline of the algorithm is as
follows. For a given partial $k$-tree $G = (V, E)$, we first find an appropriate subset
$F \subseteq E$ inducing a forest of $G$, then find a “generalized coloring” of $G$ for $F$ and
an ordinary edge-coloring of the subgraph $H = G[\bar{F}]$ of $G$ induced by $\bar{F} = E - F$,
and finally superimpose the edge-coloring on the generalized coloring to obtain
a total coloring of $G$. The generalized coloring is an extended version of a total
coloring and an ordinary vertex-coloring, and is newly introduced in this paper.
Since $F$ induces a forest of $G$, a generalized coloring of $G$ for $F$ can be found
in linear time. Since $H$ is a partial $k$-tree, an edge-coloring of $H$ can be found
in linear time. Hence the total running time of our algorithm is linear. Thus
our algorithm is completely different from an ordinary dynamic programming
approach.

The paper is organized as follows. In Section 2, we give some basic definitions.
In Section 3, we give a linear algorithm for finding a total coloring of a partial
$k$-tree, and verify the correctness of the algorithm.



2 Terminologies and Definitions 

In this section we give some basic terminologies and definitions. 

For two sets $A$ and $B$, we denote by $A - B$ the set of elements $a$ such that
$a \in A$ and $a \notin B$.

We denote by $G = (V, E)$ a simple undirected graph with a vertex set $V$ and
an edge set $E$. For a graph $G = (V, E)$ we often write $V = V(G)$ and $E = E(G)$.
We denote by $n$ the cardinality of $V(G)$. We denote by $\chi'(G)$ the minimum
number of colors required for an ordinary edge-coloring of $G$, and call $\chi'(G)$ the
chromatic index of $G$.

For a set $F \subseteq E$ and a vertex $v \in V$, we write $d_F(v, G) = |\{(v, w) \in F : w \in V\}|$
and $\Delta_F(G) = \max\{d_F(v, G) : v \in V\}$. In particular, we call
$d(v, G) = d_E(v, G)$ the degree of $v$, and $\Delta(G) = \Delta_E(G)$ the maximum degree of $G$.
Let $F$ be a subset of $E$, called a colored edge set, and let $C$ be a set of colors.
A generalized coloring of a graph $G$ for $F$ is a mapping $f : V \cup F \to C$ satisfying
the following three conditions:

(1) the restriction of the mapping $f$ to $V$ is a vertex-coloring of $G$, that is,
$f(v) \neq f(w)$ for any pair of adjacent vertices $v$ and $w$ in $G$;

(2) the restriction of the mapping $f$ to $F$ is an edge-coloring of the subgraph
$G[F]$ of $G$ induced by $F$, that is, $f(e) \neq f(e')$ for any pair of edges $e, e' \in F$
sharing a common end; and

(3) $f(v) \neq f(e)$ for any pair of a vertex $v \in V$ and an edge $e \in F$ incident to $v$.
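
Conditions (1)-(3) can be checked mechanically. The following sketch is an illustration under assumed conventions: edges are given as pairs of vertices, the mapping $f$ is a Python dictionary keyed by vertices and by colored edges, and the function name is hypothetical.

```python
def is_generalized_coloring(edges, colored, f):
    """Check conditions (1)-(3) for a mapping f on V and the colored edge set F.

    edges   : list of pairs (u, v), the edge set E of G
    colored : list of pairs taken from `edges`, the colored edge set F
    f       : dict mapping every vertex and every colored edge to a color
    """
    colored = list(colored)
    # (1) adjacent vertices differ; every edge of G counts, colored or not.
    for u, v in edges:
        if f[u] == f[v]:
            return False
    # (2) two colored edges sharing an end differ.
    for a in range(len(colored)):
        for b in range(a + 1, len(colored)):
            e1, e2 = colored[a], colored[b]
            if set(e1) & set(e2) and f[e1] == f[e2]:
                return False
    # (3) a colored edge differs from both of its ends.
    for e in colored:
        if f[e] in (f[e[0]], f[e[1]]):
            return False
    return True
```

Passing an empty colored edge set reduces the check to an ordinary vertex-coloring, and passing all of $E$ reduces it to a total coloring. The pairwise loop for condition (2) is quadratic in $|F|$ and is meant for clarity rather than speed.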




Fig. 2. A generalized coloring of a graph with three colors. 



Note that the edges in $\bar{F} = E - F$ are not colored by the generalized coloring
$f$. We call the edges in $F$ colored edges and the edges in $\bar{F}$ uncolored edges. A
total coloring of $G$ is a generalized coloring for a colored edge set $F = E$, while
a vertex-coloring is a generalized coloring for a colored edge set $F = \emptyset$. Thus
a generalized coloring is an extension of a total coloring and a vertex-coloring.
It should be noted that a generalized coloring of $G$ for $F$ is a total coloring of
$G[F]$ but a total coloring of $G[F]$ is not always a generalized coloring of $G$ for
$F$. The minimum number of colors required for a generalized coloring of $G$ for
$F$ is called the generalized chromatic number of $G$, and is denoted by $\chi_t(G, F)$.
In particular, we denote $\chi_t(G, E)$ by $\chi_t(G)$, and call $\chi_t(G)$ the total chromatic
number of a graph $G$. Clearly $\chi_t(G, F) \geq \Delta_F(G) + 1$ and $\chi_t(G) \geq \Delta(G) + 1$.
Figure 2 depicts a generalized coloring of a graph $G$ using $\chi_t(G, F) = 3$ colors for
the colored edge set F = {(ui, U 2 ), (ua, U 5 ), (ua, uy), (u 4 , ue), (us, ue)}, where the 
uncolored edges (^ 1 ,^ 4 ), {v 2 ,V’j), {v^,Vq) and (^ 4 ,^ 5 ) in F are drawn by dotted 
lines. 

Suppose that $g$ is a generalized coloring of $G$ for $F$, $h$ is an ordinary
edge-coloring of the subgraph $H = G[\bar{F}]$ of $G$ induced by $\bar{F}$, and $g$ and $h$ use disjoint
sets of colors. Then, superimposing $h$ on $g$, one can obtain a total coloring $f$ of
$G$. Unfortunately, the total coloring $f$ obtained in this way may use more than
$\chi_t(G)$ colors even if $g$ uses $\chi_t(G, F)$ colors and $h$ uses $\chi'(H)$ colors, because
$\chi_t(G) \leq \chi_t(G, F) + \chi'(H)$ but the equality does not always hold; for example,
$\chi_t(G) = 4$, $\chi_t(G, F) = 3$ and $\chi'(H) = 2$, and hence $\chi_t(G) < \chi_t(G, F) + \chi'(H)$ for
the graph $G$ in Figure 2. However, in Section 3, we will show that, for a partial
$k$-tree $G = (V, E)$ with large maximum degree, there indeed exists $F \subseteq E$
such that $\chi_t(G) = \chi_t(G, F) + \chi'(H)$, and show that such a set $F$, a generalized
coloring of $G$ for $F$ and an edge-coloring of $H$ can be found in linear time.
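
Superimposing an edge-coloring on a generalized coloring amounts to merging two mappings whose color sets are disjoint. The sketch below assumes both colorings are dictionaries (vertices and colored edges for $g$, uncolored edges for $h$); the function name and data layout are illustrative assumptions.

```python
def superimpose(g, h):
    """Merge a generalized coloring g and an edge-coloring h into one mapping.

    g : dict on V and the colored edges F
    h : dict on the uncolored edges E - F
    The two colorings must use disjoint color sets and disjoint domains;
    otherwise the merged mapping need not be a total coloring.
    """
    if set(g.values()) & set(h.values()):
        raise ValueError("g and h must use disjoint sets of colors")
    if set(g) & set(h):
        raise ValueError("g and h must color disjoint sets of elements")
    f = dict(g)
    f.update(h)
    return f
```

On a triangle with $F = \{(1,2)\}$, merging `{1: 'a', 2: 'b', 3: 'c', (1, 2): 'c'}` with `{(2, 3): 'x', (1, 3): 'y'}` yields a total coloring with five colors, although three colors suffice for a triangle. This is the loss that the choice of $F$ in Section 3 is designed to avoid.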

A graph $G = (V, E)$ is defined to be a $k$-tree if either it is a complete graph
of $k$ vertices or it has a vertex $v \in V$ whose neighbors induce a clique of size $k$
and the graph $G - \{v\}$ obtained from $G$ by deleting the vertex $v$ and all edges
incident to $v$ is again a $k$-tree. A graph is defined to be a partial $k$-tree if it is a
subgraph of a $k$-tree [Bod90]. In the paper we assume that $k = O(1)$. The graph
in Figure 1 is a partial 3-tree.

For a natural number $s$, an $s$-numbering of a graph $G = (V, E)$ is a bijection
$\varphi : V \to \{1, 2, \ldots, n\}$ such that $|\{(v, x) \in E : \varphi(v) < \varphi(x)\}| \leq s$ for each vertex
$v \in V$. A graph having an $s$-numbering is called an $s$-degenerated graph. Every
partial $k$-tree $G$ is a $k$-degenerated graph, and its $k$-numbering can be found in
linear time.
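
One standard way to obtain such a numbering is to remove a vertex of minimum remaining degree repeatedly and to number the vertices in order of removal. The sketch below uses this technique with a binary heap, so it runs in $O((n + |E|) \log n)$ time rather than the linear time mentioned above; it is an illustrative substitute, not the procedure the paper relies on, and the function name is hypothetical.

```python
import heapq

def degeneracy_numbering(vertices, edges):
    """Return (phi, s): a numbering phi and the forward-degree bound s it attains.

    The vertex removed first receives number 1. All neighbours still present
    at removal time receive larger numbers, so the degree at removal equals
    the number of forward edges of the vertex.
    Vertices must be mutually comparable so that heap ties can be broken.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    degree = {v: len(adj[v]) for v in adj}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    phi, removed, s = {}, set(), 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != degree[v]:
            continue                      # stale heap entry
        removed.add(v)
        phi[v] = len(phi) + 1
        s = max(s, d)
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
    return phi, s
```

For a tree the routine returns $s = 1$, and for a triangle it returns $s = 2$.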

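For an $s$-numbering $\varphi$ of $G$ and a vertex $v \in V$, we define

$E_\varphi^{fw}(v, G) = \{(v, x) \in E : \varphi(v) < \varphi(x)\}$;
$E_\varphi^{bw}(v, G) = \{(x, v) \in E : \varphi(x) < \varphi(v)\}$;

$d_\varphi^{fw}(v, G) = |E_\varphi^{fw}(v, G)|$; and
$d_\varphi^{bw}(v, G) = |E_\varphi^{bw}(v, G)|$.

The edges in $E_\varphi^{fw}(v, G)$ are called forward edges, and those in $E_\varphi^{bw}(v, G)$ backward edges. The
definition of an $s$-numbering implies that $d_\varphi^{fw}(v, G) \leq s$ for each vertex $v \in V$.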

3 A Linear Algorithm 

In this section we prove the following main theorem. 

Theorem 1. Let $G = (V, E)$ be a partial $k$-tree with bounded $k$. Then there
exists an algorithm to find a total coloring of $G$ with the minimum number $\chi_t(G)$
of colors in linear time.

We first have the following lemma [ZNN96, IZN99].

Lemma 1. For any $s$-degenerated graph $G$, the following (a) and (b) hold:

(a) if $\Delta(G) \geq 2s$, then $\chi'(G) = \Delta(G)$; and

(b) $\chi_t(G) \leq \Delta(G) + s + 2$.

Using a standard dynamic programming algorithm in [IZN99], one can solve
the total coloring problem for a partial $k$-tree $G$ in time $O(n \chi_t^{c})$, where
$\chi_t = \chi_t(G)$ and the exponent $c$ depends only on $k$; the size of a dynamic
programming table updated by the algorithm is polynomial in $\chi_t$. Since $G$ is a
partial $k$-tree, $G$ is $k$-degenerated. Furthermore $k = O(1)$. Therefore, if
$\Delta(G) = O(1)$, then $\chi_t(G) = O(1)$ by Lemma 1(b) and hence
the algorithm takes linear time to solve the total coloring problem. Thus it
suffices to give an algorithm for the case $\Delta(G)$ is large, say $\Delta(G) \geq 8k^2$.
Our idea is to find a subset $F$ of $E$ such that $\chi_t(G) = \chi_t(G, F) + \chi'(H)$ as
described in the following lemma.

Lemma 2. Assume that $G = (V, E)$ is an $s$-degenerated graph and has an
$s$-numbering $\varphi$. If $\Delta(G) \geq 8s^2$, then there exists a subset $F$ of $E$ satisfying the
following conditions (a)-(h):

(a) $\Delta(G) = \Delta_F(G) + \Delta_{\bar{F}}(G)$, where $\bar{F} = E - F$;

(b) $\Delta_F(G) \geq s + 1$;

(c) $\Delta_{\bar{F}}(G) \geq 2s$;

(d) the set $F$ can be found in linear time;

(e) $\varphi$ is a 1-numbering of $G' = (V, F)$, and hence $G'$ is a forest;

(f) $\chi_t(G, F) = \Delta_F(G) + 1$, and a generalized coloring of $G$ for $F$ using
$\Delta_F(G) + 1$ colors can be found in linear time;

(g) $\chi'(H) = \Delta_{\bar{F}}(G)$, where $H = (V, \bar{F})$; and

(h) $\chi_t(G) = \chi_t(G, F) + \chi'(H)$.

Proof. The proofs of (a)-(e) will be given later. We now prove only (f)-(h).

(f) Let $C$ be a set of $\Delta_F(G) + 1$ colors. For each $i = 1, 2, \ldots, n$, let $v_i$ be the
vertex of $G$ such that $\varphi(v_i) = i$, let $V_\varphi^{fw}(v_i) = \{x \in V : (v_i, x) \in E, \varphi(v_i) < \varphi(x)\}$,
and let $E_\varphi^{fw}(v_i) = \{(v_i, x) \in F : \varphi(v_i) < \varphi(x)\}$. Since $\varphi$ is an $s$-numbering
of $G$, $d_\varphi^{fw}(v_i, G) = |V_\varphi^{fw}(v_i)| \leq s$ for each $i = 1, 2, \ldots, n$. By (e) $\varphi$ is a
1-numbering of $G' = (V, F)$, and hence $d_\varphi^{fw}(v_i, G') = |E_\varphi^{fw}(v_i)| \leq 1$ for each
$i = 1, 2, \ldots, n$.

We construct a generalized coloring $g$ of $G$ for $F$ using colors in $C$ as follows.
We first color $v_n$ by any color $c$ in $C$: let $g(v_n) := c$. Suppose that we have
colored the vertices $v_n, v_{n-1}, \ldots, v_{i+1}$ and the edges in
$E_\varphi^{fw}(v_{n-1}) \cup E_\varphi^{fw}(v_{n-2}) \cup \cdots \cup E_\varphi^{fw}(v_{i+1})$,
and that we are now going to color $v_i$ and the edge in $E_\varphi^{fw}(v_i)$
if $E_\varphi^{fw}(v_i) \neq \emptyset$. There are two cases to consider.

Case 1: $E_\varphi^{fw}(v_i) \neq \emptyset$.

In this case $E_\varphi^{fw}(v_i)$ contains exactly one edge $e = (v_i, v_j)$, where $i < j \leq n$.
We first color $e$. Let $C' = \{g((v_j, v_l)) : (v_j, v_l) \in F,\ i + 1 \leq l \leq n\} \subseteq C$; then
we must assign to $e$ a color not in $\{g(v_j)\} \cup C'$. Since $e = (v_j, v_i) \in F$, we have

$|\{(v_j, v_i)\} \cup \{(v_j, v_l) \in F : i + 1 \leq l \leq n\}| \leq d(v_j, G')$

and hence $|C'| \leq d(v_j, G') - 1$. Clearly $d(v_j, G') \leq \Delta_F(G) = |C| - 1$. Therefore
we have $|C'| \leq |C| - 2$. Thus there exists a color $c' \in C$ not in $\{g(v_j)\} \cup C'$. We
color $e$ by $c'$: let $g(e) := c'$.

We next color $v_i$. Let $C'' = \{g(x) : x \in V_\varphi^{fw}(v_i)\}$; then we must assign to $v_i$
a color not in $\{c'\} \cup C''$. Since $|C''| \leq |V_\varphi^{fw}(v_i)| \leq s$ and $\Delta_F(G) \geq s + 1$ by (b)
above, we have $|\{c'\} \cup C''| \leq s + 1 \leq \Delta_F(G) = |C| - 1$. Thus there exists a color
$c'' \in C$ not in $\{c'\} \cup C''$, and we can color $v_i$ by $c''$: let $g(v_i) := c''$.

Case 2: $E_\varphi^{fw}(v_i) = \emptyset$.

In this case we need to color only $v_i$. Similarly as above, there exists a color
$c'' \in C$ not in $C''$ since $|C''| \leq s < \Delta_F(G) < |C|$. Therefore we can color $v_i$ by
$c''$: let $g(v_i) := c''$.

Thus we have colored $v_i$ and the edge in $E_\varphi^{fw}(v_i)$ if $E_\varphi^{fw}(v_i) \neq \emptyset$. Repeating
the operation above for $i = n - 1, n - 2, \ldots, 1$, we can construct a generalized
coloring $g$ of $G$ for $F$ using colors in $C$. Hence $\chi_t(G, F) \leq |C| = \Delta_F(G) + 1$.
Clearly $\chi_t(G, F) \geq \Delta_F(G) + 1$, and hence we have $\chi_t(G, F) = \Delta_F(G) + 1$.
Clearly the construction of $g$ above takes linear time. Thus we have proved (f).
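
The construction in the proof of (f) translates almost line by line into code. The following sketch assumes that the vertices are already named $1, 2, \ldots, n$ according to the numbering, that $F \subseteq E$, and that colors are the integers $0, \ldots, |C| - 1$; the function name and the error handling are illustrative additions. Sets are used for the forbidden colors, so the sketch makes no claim to the linear running time stated in the proof.

```python
def generalized_coloring(n, edges, colored, num_colors):
    """Greedy generalized coloring, processing vertices n, n-1, ..., 1.

    n          : number of vertices, named 1..n by the numbering phi
    edges      : list of pairs (u, v), the edge set E
    colored    : list of pairs from `edges`, the colored edge set F;
                 every vertex may have at most one forward edge in F
    num_colors : size of the color set C
    Returns (g_vertex, g_edge); colored edges are keyed as (low, high).
    """
    fw_neighbours = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        fw_neighbours[min(u, v)].append(max(u, v))
    fw_colored = {}
    for u, v in colored:
        lo, hi = min(u, v), max(u, v)
        if lo in fw_colored:
            raise ValueError("phi is not a 1-numbering of (V, F)")
        fw_colored[lo] = hi

    def pick(forbidden):
        for c in range(num_colors):
            if c not in forbidden:
                return c
        raise ValueError("not enough colors")

    g_vertex, g_edge = {}, {}
    edge_colors_at = {v: set() for v in range(1, n + 1)}  # colors of F-edges at v
    for i in range(n, 0, -1):
        # C'': colors of the forward neighbours of v_i (all colored already).
        forbidden = {g_vertex[x] for x in fw_neighbours[i]}
        if i in fw_colored:                     # Case 1: color e = (v_i, v_j) first
            j = fw_colored[i]
            c1 = pick(edge_colors_at[j] | {g_vertex[j]})   # avoid g(v_j) and C'
            g_edge[(i, j)] = c1
            edge_colors_at[i].add(c1)
            edge_colors_at[j].add(c1)
            forbidden.add(c1)
        g_vertex[i] = pick(forbidden)           # Cases 1 and 2: color v_i
    return g_vertex, g_edge
```

For a star with center 5, leaves 1 to 4, all star edges colored and one extra uncolored edge $(1, 2)$, we have $s = 2$ and $\Delta_F(G) = 4 \geq s + 1$, and the routine succeeds with $\Delta_F(G) + 1 = 5$ colors.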

(g) Since $G$ is $s$-degenerated, the subgraph $H$ of $G$ is $s$-degenerated. By (c)
we have $\Delta(H) = \Delta_{\bar{F}}(G) \geq 2s$. Therefore by Lemma 1(a) we have $\chi'(H) =
\Delta(H) = \Delta_{\bar{F}}(G)$. Thus we have proved (g).

(h) We can obtain a total coloring of $G$ by superimposing an edge-coloring of
$H$ on a generalized coloring of $G$ for $F$. Therefore we have $\chi_t(G) \leq \chi_t(G, F) +
\chi'(H)$. Since $\chi_t(G) \geq \Delta(G) + 1$, by (a), (f) and (g) we have

$\chi_t(G) \geq \Delta(G) + 1$
$= \Delta_F(G) + \Delta_{\bar{F}}(G) + 1$
$= \chi_t(G, F) + \chi'(H)$.

Thus we have $\chi_t(G) = \chi_t(G, F) + \chi'(H)$.

We now have the following theorem. 

Theorem 2. If $G$ is an $s$-degenerated graph and $\Delta(G) \geq 8s^2$, then $\chi_t(G) =
\Delta(G) + 1$.

Proof. By (a), (f), (g) and (h) in Lemma 2 we have

$\chi_t(G) = \chi_t(G, F) + \chi'(H)$
$= \Delta_F(G) + 1 + \Delta_{\bar{F}}(G)$
$= \Delta(G) + 1$.



We are now ready to present our algorithm to find a total coloring of a given
partial $k$-tree $G = (V, E)$ with $\Delta(G) \geq 8k^2$.

[Total-Coloring Algorithm]

Step 1. Find a subset $F \subseteq E$ satisfying Conditions (a)-(h) in Lemma 2.
Step 2. Find a generalized coloring $g$ of $G$ for $F$ using $\chi_t(G, F) = \Delta_F(G) + 1$ colors.
Step 3. Find an ordinary edge-coloring $h$ of $H$ using $\chi'(H) = \Delta_{\bar{F}}(G)$ colors.
Step 4. Superimpose the edge-coloring $h$ on the generalized coloring $g$ to
obtain a total coloring $f$ of $G$ using $\chi_t(G) = \Delta(G) + 1$ colors.

Since $G$ is a partial $k$-tree, $G$ is $k$-degenerated. Since $\Delta(G) \geq 8k^2$, by
Lemma 2(d) one can find the subset $F \subseteq E$ in Step 1 in linear time. By
Lemma 2(f) one can find the generalized coloring $g$ in Step 2 in linear time.
Since $G$ is a partial $k$-tree, a subgraph $H$ of $G$ is also a partial $k$-tree. Therefore,
in Step 3 one can find the edge-coloring $h$ of $H$ in linear time by an algorithm
in [ZNN96] although $\chi'(H)$ is not always bounded. Thus the algorithm runs in
linear time.
This completes the proof of Theorem 1.

In the remainder of this section we prove (a)-(e) of Lemma 2. We need the
following two lemmas.

Lemma 3. Let $G = (V, E)$ be an $s$-degenerated graph, and let $\varphi$ be an $s$-numbering
of $G$. If $\Delta(G)$ is even, then there exists a subset $E'$ of $E$ satisfying the following
three conditions (a)-(c):

(a) $\Delta(G') = \Delta(G'') = \Delta(G)/2$;

(b) $\varphi$ is an $\lceil s/2 \rceil$-numbering of $G'$; and

(c) $|E'| \leq |E|/2$,

where $G' = (V, E')$ and $G'' = (V, E - E')$. Furthermore, such a set $E'$ can be
found in linear time.

Proof. Omitted in this extended abstract due to the page limitation. 

Lemma 4. Let $G = (V, E)$ be an $s$-degenerated graph, and let $\alpha$ be a natural
number. If $\Delta(G) \geq 2\alpha \geq 2s$, then there exists a subset $E'$ of $E$ such that
$\Delta_{E'}(G) + \Delta_{\overline{E'}}(G) = \Delta(G)$ and $\Delta_{E'}(G) = \alpha$, where $\overline{E'} = E - E'$. Furthermore,
such a set $E'$ can be found in linear time.

Proof. Since $G$ is an $s$-degenerated graph and $\Delta(G) \geq 2\alpha \geq 2s$, there exists a
partition $\{E_1, E_2, \ldots, E_l\}$ of $E$ satisfying the following three conditions (i)-(iii)
[ZNN96, pp. 610]:

(i) $\sum_{i=1}^{l} \Delta_{E_i}(G) = \Delta(G)$;

(ii) $\Delta_{E_i}(G) = \alpha$ for each $i = 1, 2, \ldots, l - 1$; and

(iii) $\alpha \leq \Delta_{E_l}(G) < 2\alpha$.

Let $E' = E_1$. Then $\Delta_{E'}(G) = \Delta_{E_1}(G) = \alpha$ by (ii), and clearly

$\Delta_{\overline{E'}}(G) \geq \Delta(G) - \Delta_{E'}(G)$. (1)

On the other hand, since $\overline{E'} = E_2 \cup E_3 \cup \cdots \cup E_l$, by (i) we have

$\Delta_{\overline{E'}}(G) \leq \sum_{i=2}^{l} \Delta_{E_i}(G)$
$= \Delta(G) - \Delta_{E_1}(G)$
$= \Delta(G) - \Delta_{E'}(G)$. (2)

Thus by Eqs. (1) and (2) we have $\Delta_{\overline{E'}}(G) = \Delta(G) - \Delta_{E'}(G)$, and hence
$\Delta_{E'}(G) + \Delta_{\overline{E'}}(G) = \Delta(G)$. Since the partition $\{E_1, E_2, \ldots, E_l\}$ of $E$ can be found in linear
time [ZNN96], the set $E' = E_1$ can be found in linear time.

We are now ready to give the remaining proof of Lemma 2.

Remaining Proof of Lemma 2. Since we have already proved (f)-(h) before,
we now prove (a)-(e).

Let $p = \lfloor \log \Delta(G) \rfloor$. Then $2^p \leq \Delta(G) < 2^{p+1}$. Since $\Delta(G) \geq 8s^2$, we have
$p = \lfloor \log \Delta(G) \rfloor \geq 3 + \lfloor 2 \log s \rfloor \geq 2 + 2 \log s$. Therefore we have
$\Delta(G) \geq 2^p \geq 2^{2 + 2 \log s} = 4s^2 \geq 2s$.






S. Isobe, X. Zhou, and T. Nishizeki 



Let $q = \lceil \log s \rceil$. Then $2^{q-1} < s \leq 2^q$. We find $F$ by constructing a sequence
of $q + 1$ spanning subgraphs $G_0, G_1, \ldots, G_q$ of $G$ as follows.

1 procedure FIND-F
2 begin
3   by Lemma 4 find a subset $E_0$ of $E$ such that
      (3-1) $\Delta_{E_0}(G) = 2^{p-1}$; and
      (3-2) $\Delta_{E_0}(G) + \Delta_{\bar{E}_0}(G) = \Delta(G)$, where $\bar{E}_0 = E - E_0$;
      {Choose $\alpha = 2^{p-1}$, then $\Delta(G) \geq 2^p = 2\alpha \geq 2s$, and hence
      there exists such a set $E_0$ by Lemma 4}
4   let $G_0 := (V, E_0)$ and let $s_0 := s$;
      {$\Delta(G_0) = 2^{p-1}$ and $\varphi$ is an $s_0$-numbering of $G_0$}
5   for $i := 0$ to $q - 1$ do
6   begin
7     by Lemma 3 find a subset $E'_i$ of $E_i$ satisfying
        (7-1) $\Delta(G'_i) = \Delta(G''_i) = \Delta(G_i)/2$; {$\Delta(G_i)$ is even}
        (7-2) $\varphi$ is an $s_{i+1}$-numbering of $G'_i$, where $s_{i+1} = \lceil s_i/2 \rceil$; and
        (7-3) $|E'_i| \leq |E_i|/2$,
        where $G'_i = (V, E'_i)$, $G''_i = (V, E''_i)$ and $E''_i = E_i - E'_i$;
8     let $E_{i+1} := E'_i$ and let $G_{i+1} := (V, E_{i+1})$;
9   end
10  let $F := E_q$;
11 end.

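We first prove (a). Since $F = E_q$, $\Delta_F(G) = \Delta_{E_q}(G) = \Delta(G_q)$. Therefore we
have

$\Delta_{\bar{F}}(G) \geq \Delta(G) - \Delta_F(G) = \Delta(G) - \Delta(G_q)$. (3)

By line 7 and line 8 in the procedure above, we have $G_{i+1} = G'_i$, $\Delta(G_{i+1}) =
\Delta(G'_i)$ and $\Delta(G'_i) + \Delta(G''_i) = \Delta(G_i)$, and hence $\Delta(G''_i) = \Delta(G_i) - \Delta(G_{i+1})$ for
each $i = 0, 1, \ldots, q - 1$. Therefore we have

$\sum_{i=0}^{q-1} \Delta(G''_i) = \sum_{i=0}^{q-1} (\Delta(G_i) - \Delta(G_{i+1}))$
$= \Delta(G_0) - \Delta(G_q)$. (4)

Since $\Delta(G_0) = \Delta_{E_0}(G)$, by (3-2) in the procedure above we have

$\Delta(G_0) + \Delta_{\bar{E}_0}(G) = \Delta(G)$. (5)

Furthermore, since $\bar{F} = E - F = \bar{E}_0 \cup E''_0 \cup E''_1 \cup \cdots \cup E''_{q-1}$ and $\Delta_{E''_i}(G) = \Delta(G''_i)$
for each $i = 0, 1, \ldots, q - 1$, by Eqs. (4) and (5) we have

$\Delta_{\bar{F}}(G) \leq \Delta_{\bar{E}_0}(G) + \sum_{i=0}^{q-1} \Delta_{E''_i}(G)$
$= \Delta_{\bar{E}_0}(G) + \sum_{i=0}^{q-1} \Delta(G''_i)$
$= \Delta_{\bar{E}_0}(G) + \Delta(G_0) - \Delta(G_q)$
$= \Delta(G) - \Delta(G_q)$. (6)

Therefore by Eqs. (3) and (6) we have $\Delta_{\bar{F}}(G) = \Delta(G) - \Delta(G_q)$. Since $\Delta(G_q) =
\Delta_F(G)$, we have $\Delta_{\bar{F}}(G) = \Delta(G) - \Delta_F(G)$, and hence $\Delta_F(G) + \Delta_{\bar{F}}(G) = \Delta(G)$.
Thus we have proved (a).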

We next prove (b). By (7-1) and line 8 in the procedure above we have
$\Delta(G_{i+1}) = \Delta(G'_i) = \Delta(G_i)/2$ for each $i = 0, 1, \ldots, q - 1$, and by (3-1) we have
$\Delta(G_0) = 2^{p-1}$. Therefore we have

$\Delta_F(G) = \Delta(G_q) = 2^{p-q-1}$. (7)

Since $2^p \geq 4s^2$ and $2^{q-1} < s$, we have $\Delta_F(G) = 2^{p-q-1} = 2^p / (4 \cdot 2^{q-1}) > 4s^2/4s = s$.
Thus we have $\Delta_F(G) \geq s + 1$, and hence (b) holds.

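We next prove (c). By (a) and Eq. (7) we have $\Delta_{\bar{F}}(G) = \Delta(G) - \Delta_F(G) =
\Delta(G) - 2^{p-q-1}$, and hence

$\Delta_{\bar{F}}(G) = \Delta(G) - \frac{2^p}{2^{q+1}}$
$\geq \Delta(G) - \frac{\Delta(G)}{2^{q+1}}$
$= \Delta(G) \left(1 - \frac{1}{2^{q+1}}\right)$
$\geq 8s^2 \left(1 - \frac{1}{2s}\right)$
$= 4s(2s - 1)$
$\geq 2s$

since $\Delta(G) \geq 8s^2$, $\Delta(G) \geq 2^p$ and $s \leq 2^q$. Thus we have proved (c).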
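
The quantities appearing in the proofs of (b) and (c) are easy to tabulate. With $p = \lfloor \log \Delta(G) \rfloor$ and $q = \lceil \log s \rceil$, procedure FIND-F starts from a subgraph of maximum degree $2^{p-1}$ and halves it $q$ times, so $\Delta_F(G) = 2^{p-q-1}$ and $\Delta_{\bar{F}}(G) = \Delta(G) - 2^{p-q-1}$. The sketch below only evaluates these formulas and the sequence $s_0, s_1, \ldots, s_q$ with integer arithmetic; the function name and the returned dictionary are illustrative assumptions.

```python
def find_f_parameters(max_degree, s):
    """Evaluate p, q, the degrees promised by FIND-F, and the sequence s_0..s_q.

    max_degree : the maximum degree of G, required to be at least 8 * s**2
    s          : the degeneracy bound, s >= 1
    """
    if s < 1 or max_degree < 8 * s * s:
        raise ValueError("requires s >= 1 and max_degree >= 8 * s**2")
    p = max_degree.bit_length() - 1        # floor(log2(max_degree))
    q = (s - 1).bit_length()               # ceil(log2(s))
    s_sequence = [s]
    for _ in range(q):
        s_sequence.append((s_sequence[-1] + 1) // 2)   # ceil(s_i / 2)
    delta_f = 2 ** (p - q - 1)
    return {
        "p": p,
        "q": q,
        "delta_F": delta_f,
        "delta_Fbar": max_degree - delta_f,
        "s_sequence": s_sequence,
    }
```

For $\Delta(G) = 200$ and $s = 5$ this gives $p = 7$, $q = 3$, $\Delta_F(G) = 8 \geq s + 1$ and $\Delta_{\bar{F}}(G) = 192 \geq 2s$, and the sequence 5, 3, 2, 1 ends in 1.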

We next prove (d). By Lemma 4 line 3 can be done in time $O(|E|)$. By
Lemma 3 line 7 can be done in time $O(|E_i|)$. Therefore the for statement in
line 5 can be done in time $O(\sum_{i=0}^{q-1} |E_i|)$. Since $|E_0| \leq |E|$ and $|E_{i+1}| =
|E'_i| \leq |E_i|/2$ for each $i = 0, 1, \ldots, q - 1$ by (7-3) in the procedure above, we
have $\sum_{i=0}^{q-1} |E_i| \leq 2|E|$. Thus the for statement can be done
in time $O(|E|)$. Thus $F$ can be found in linear time, and hence (d) holds.
We finally prove (e). Since $s = s_0 \leq 2^q$ and $s_i = \lceil s_{i-1}/2 \rceil \leq s_{i-1}/2 + 1/2$ for
each $i = 1, 2, \ldots, q$, we have

$s_q \leq \frac{s_0}{2^q} + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^q}$
$= \frac{s_0}{2^q} + 1 - \frac{1}{2^q}$
$< 2$,

and hence $s_q = 1$. Therefore $\varphi$ is a 1-numbering of $G' = (V, F)$. Thus we have
proved (e).

This completes the proof of Lemma 2.
Abstract. The topology-oriented approach is a principle for translating
geometric algorithms into practically valid computer software. In this
principle, the highest priority is placed on the topological consistency of
the geometric objects; numerical values are used as lower-priority
information. The resulting software is completely robust in the sense that no
matter how large the numerical errors are, the algorithm never fails. The
basic idea of this approach and various examples are surveyed.



1 Introduction 



There is a great gap between theoretically correct geometric algorithms and 
practically valid computer programs. This is mainly because the correctness of 
the algorithms is based on the assumption that numerical computation is done 
in infinite precision, whereas in actual computers computation is done only in 
finite precision. Numerical errors often generate topological inconsistency, and 
thus cause the computer programs to fail.

To overcome this difficulty, many approaches have been proposed. They can 
be classified according to how much they rely on numerical values. 

The first group of approaches relies on numerical values absolutely. The to- 
pological structure of a geometric object is determined by the signs of the results 
of numerical computations. If we restrict the precision of the input data, these 
signs can be judged correctly in a sufficiently high but still finite precision. Using 
this principle, the topological structures are judged correctly as if the compu- 
tation is done exactly. These approaches are called exact-arithmetic approaches 
[If |,3fhl1 4pi 511 7f21 r2,3|2fi| . In these approaches, we need not worry about misjudge- 
ment and hence theoretical algorithms can be implemented in a rather straight- 
forward manner. However, the computation is expensive because multiple preci- 
sion is required, and furthermore exceptional branches of processing for degene- 
rate cases are necessary because all the degenerate cases are recognized exactly. 

The second group of approaches relies on numerical values moderately. Every 
time numerical computation is done, the upper bound of the error is also evalua- 
ted. On the basis of this error bound, the result of computation is judged to be 
either reliable or unreliable, and the only reliable result is used [2l4l6llL)H6j . Ho- 
wever, these approaches make program codes unnecessarily complicated because 
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every numerical computation should be followed by two alternative branches of 
processing, one for the reliable case and the other for the unreliable case. Moreo- 
ver, they decrease the portability of the software products, because the amount 
of errors depends on computation environment. 

The third group of approaches does not rely on numerical values at all. In
this group of approaches, we start with the assumption that every numerical 
computation contains errors and that the amount of the error is unknown. We 
place the highest priority on the consistency of topological properties, and use 
numerical results only when they are consistent with the topological properties, 
thus avoiding inconsistency |7|S|1 1 11 2|1 ,3nhj20l‘24|‘25] . These approaches are cal- 
led topology-oriented approaches. 

In this paper, we concentrate on the third group of approaches, and survey 
its basic idea and various applications. 



2 Basic Idea 

Let A be an algorithm for solving geometric problem P. In general the algorithm 
A contains many crotches at which the flow of the procedure diverges to two or 
more alternative branches. Each path from the starting point in this branching 
structure corresponds to a behavior of the algorithm A. 

Once we fix an instance of the problem, all of such paths can be classified 
into three groups. The first group of paths leads to the correct solution of the 
problem instance; in many algorithms this group contains only a single path. 
The second group of paths leads to the solutions of other instances belonging 
to the problem P. The third group of paths corresponds to inconsistency. Note 
that even though the algorithm A is correct, there are the second and the third 
groups of paths. The correctness of the algorithm is guaranteed only in the sense 
that these groups of paths will never be selected in precise arithmetic. 

Let us consider a simple example. Let $P_1, P_2, P_3, P_4$ be a cyclic sequence of
vertices forming a convex quadrangle in the three-dimensional space, and $H$ be
a non-vertical plane that does not contain any of these vertices. Suppose that we
want to cut the quadrangle by the plane $H$. How the quadrangle is cut depends
on which side of $H$ each vertex lies. Hence, a natural algorithm to solve this
problem will have two-branch crotches, each of which corresponds to whether
the vertex $P_i$ is above $H$ or below $H$. Hence, the flow of the procedure forms a
binary-tree structure as shown in Fig. 1, where each non-leaf node corresponds
to the judgement on whether $P_i$ is above $H$, and the left branch and the right
branch correspond to the affirmative case and the negative case, respectively.

As seen in Fig. 1, the procedure contains $2^4 = 16$ paths from the root to
the leaves. Among them, the two paths represented by bold lines correspond
to inconsistent cases; in these cases all the adjacent vertices are on mutually
opposite sides of $H$, which never happens in Euclidean geometry because two
distinct planes in the three-dimensional space can admit at most one line of
intersection. Thus we get the next requirement.
Fig. 1. Tree structure of the flow of the procedure.



TR 2.1. If two mutually diagonal vertices belong to the same side of $H$, at
least one of the other vertices should also belong to that side.

TR is an abbreviation of “Topological Requirement”. We call the above
requirement a “topological requirement” because it can be stated in purely topological
(i.e., nonmetric) terms.
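
The count of inconsistent paths can be confirmed by brute force. The sketch below enumerates all 16 above/below patterns for the four vertices and tests TR 2.1 on both diagonals; the encoding of a path as a tuple of booleans and the function name are assumptions made for illustration.

```python
from itertools import product

def violates_tr21(sides):
    """Return True if the side pattern violates TR 2.1.

    sides[i] is True when vertex P_{i+1} is judged to be above H;
    the four vertices are listed in cyclic order, so (0, 2) and (1, 3)
    are the two pairs of mutually diagonal vertices.
    """
    for a in (0, 1):
        c = a + 2
        others = ((a + 1) % 4, (a + 3) % 4)
        same_side = sides[a] == sides[c]
        if same_side and all(sides[o] != sides[a] for o in others):
            return True
    return False

paths = list(product([True, False], repeat=4))
inconsistent = [s for s in paths if violates_tr21(s)]
```

Only the two alternating patterns, above-below-above-below and its mirror image, violate the requirement, in agreement with the two bold paths described in the text.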

Geometric algorithms usually contain inconsistent paths, just as the bold 
paths in Fig. 1. Nevertheless, these algorithms are theoretically correct because in
the theoretical world numerical judgement is always correct and hence will never 
choose such an inconsistent path. In computers, on the other hand, branches are 
chosen on the basis of finite-precision computation and hence inconsistent paths 
are sometimes chosen. 

The left bold path in Fig. 1 says that $P_1$ is above $H$, $P_2$ is below $H$ and $P_3$
is above $H$. Then, the topological requirement TR 2.1 implies that $P_4$ should
be above $H$. In this sense the fourth judgement is redundant. Such redundant
judgements cause inconsistent branching.

The tree structure shown in Fig. 2 is obtained from Fig. 1 by removing such
redundant judgements. In this structure all the paths belong to either the first
group or the second group, and hence, no matter how poor the precision used in
computation is, the algorithm never comes across inconsistency.

In the topology-oriented approach, we first collect topological requirements
that should be satisfied by the geometric objects, next find redundant
judgements, and finally remove them from the tree structure of the flow of processing,
just as we rewrite the structure in Fig. 1 to that in Fig. 2. What we want to
emphasize here is that the topological requirements can be stated in purely
topological terms, which means that these requirements can be checked by
logical (non-numerical) computation only, and hence can always be checked correctly.
The resulting procedure contains only “possible” paths (i.e., paths belonging to
the first and the second groups), and consequently is free from inconsistency.
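
For the quadrangle example, removing the redundant judgement can be sketched as follows. The first three vertices are judged numerically; the fourth is judged numerically only when TR 2.1 does not already force its side. The input format (four possibly erroneous signed distances to $H$) and the function name are assumptions, and this is one possible pruning consistent with the text rather than a reproduction of the tree in Fig. 2.

```python
def classify_sides(signed_dist):
    """Decide for P_1..P_4 whether each vertex is above H (True) or below (False).

    signed_dist : four floating-point values, the computed (possibly wrong)
                  signed distances of the vertices to H, in cyclic order.
    The result always satisfies TR 2.1, whatever the numerical errors are.
    """
    above = [d > 0 for d in signed_dist[:3]]
    if above[0] == above[2] and above[1] != above[0]:
        # P_1 and P_3 agree and P_2 is on the other side: TR 2.1 forces P_4
        # to the side of P_1, so its numerical value is not consulted.
        above.append(above[0])
    else:
        above.append(signed_dist[3] > 0)
    return above
```

With the inconsistent input `[1.0, -1.0, 1.0, -1.0]` the routine returns `[True, False, True, True]`: the fourth numerical value is ignored, and the output corresponds to a valid, though possibly different, problem instance.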
Fig. 2. Tree structure of the flow of the procedure formed by only nonredundant judgements.



3 Examples 

In this section we list examples in which the topology-oriented approach has been
applied successfully. We pay special attention to the topological requirements used
in these applications.



3.1 Incremental Construction of Voronoi Diagrams 

Let $S = \{P_1, P_2, \ldots, P_n\}$ be a finite set of points in the plane. The
region $R(S; P_i)$ defined by $R(S; P_i) = \{P \in \mathbb{R}^2 \mid d(P, P_i) <
d(P, P_j),\ j = 1, \ldots, i-1, i+1, \ldots, n\}$ is called the Voronoi region
of $P_i$, where $d(P, Q)$ represents the Euclidean distance between the two
points $P$ and $Q$. The partition of the plane into Voronoi regions
$R(S; P_i)$, $i = 1, 2, \ldots, n$, and their boundaries is called the Voronoi
diagram for $S$.
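
As a small illustration of the definition above, the following sketch decides which Voronoi region a query point belongs to by comparing Euclidean distances. The function name `voronoi_region_index` and the sample sites are hypothetical, and ties on a region boundary are resolved toward the lowest index purely for convenience.

```python
import math

def voronoi_region_index(sites, p):
    """Return the index i of the site P_i that is nearest to the point p.

    Points exactly on a boundary do not belong to any open region; this
    sketch simply reports the lowest-index nearest site in that case.
    """
    best = 0
    for i in range(1, len(sites)):
        if math.dist(p, sites[i]) < math.dist(p, sites[best]):
            best = i
    return best

# Three hypothetical sites in the plane.
sites = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
```

Note that this test is purely numerical, so near a boundary it is exactly the kind of judgement that may be wrong under rounding errors.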

In the incremental algorithm, we start with the Voronoi diagram for a few
points, and modify it by adding the other points one by one. At each addition
of a new point, a substructure of the Voronoi diagram is removed and a cyclic
sequence of new edges is generated in order to represent the region for the new
point.

We place three additional points whose convex hull contains all the given 
points, and start the incremental construction with these three points. Then, 
the following properties should be satisfied. 

TR 3.1.1. At each addition of a new point, the substructure to be removed
should be a tree in a graph theoretic sense.

TR 3.1.2. Two Voronoi regions should not share two or more common edges. 
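
A requirement such as TR 3.1.1 can be checked by purely combinatorial computation. The sketch below is one way to do so, assuming the substructure is given as a vertex list and an undirected edge list; the name `is_tree` is hypothetical and not taken from the software cited here.

```python
def is_tree(vertices, edges):
    """True if the undirected graph (vertices, edges) is connected and acyclic."""
    vertices = list(vertices)
    # A tree on n vertices has exactly n - 1 edges.
    if not vertices or len(edges) != len(vertices) - 1:
        return False
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Depth-first search from an arbitrary vertex to test connectivity.
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == len(vertices)
```

No floating-point value enters this test, which is why such a check cannot be invalidated by numerical errors.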

The software based on these topological requirements could construct the
Voronoi diagram for one million points placed at random in single-precision
floating-point arithmetic in $O(n)$ time on the average [24,25]. Since this
software can generate even highly degenerate Voronoi diagrams, it can be used
for the approximate computation of the Voronoi diagrams for general figures [?].

3.2 Divide-and-Conquer Construction of Delaunay Diagrams

From the Voronoi diagram, we can generate another diagram by connecting two 
points by line segments if their Voronoi regions are adjacent. This diagram is 
called the Delaunay diagram. The construction of the Voronoi diagram and the 
construction of the Delaunay diagram are almost equivalent, because we can 
easily generate one from the other. 

The divide-and-conquer algorithm can construct the Delaunay diagram in
$O(n \log n)$ worst-case optimal time. In this algorithm, the input points are
divided into left and right halves recursively, and the Delaunay diagram for the
left points and that for the right points are merged into the Delaunay diagram
for the whole set of points.

In the merge step, some edges are removed from the left and the right dia- 
grams, and new edges (called traverse edges) connecting the left and the right 
are generated. We can place the following requirements among others. 

TR 3.2.1. An edge should not be removed if the resulting diagram becomes 
disconnected. 

TR 3.2.2. An edge should not be removed if the edge is adjacent to traverse 
edges at both of the terminal vertices. 

TR 3.2.3. Parallel edges should not be generated. 

TR 3.2.4. An edge should not be generated if it violates the planarity of 
the diagram. 

These requirements were used in our computer program, which runs in
$O(n \log n)$ time for general input, and runs in $O(n)$ average time for
uniformly distributed points [?].

3.3 Construction of 3-d Voronoi Diagrams 

The Voronoi diagram can be defined in any dimension in the same way as in
the two-dimensional space. In the three-dimensional space, the Voronoi region 
is a convex polyhedron and the Voronoi diagram is the partition of the space 
into these polyhedra and their boundaries. This partition admits the following 
requirements. 

TR 3.3.1. A Voronoi region should be simply connected. 

TR 3.3.2. Two Voronoi regions should share at most one common polygonal 
face. 

These topological requirements were used in the computer program [8], in
which the way of numerical computation was tuned up to improve the quality of
the output. The numerical difficulty we found in this experience was discussed
in [?].



3.4 Intersection of Convex Polyhedra 

The problem of intersecting two convex polyhedra can be reduced to a series of 
problems of cutting a convex polyhedron by a plane. To cut a convex polyhedron 
by a plane, we have to partition the vertices into two groups, one belonging to 
one side of the cutting plane, and the other belonging to the opposite side. This 
partition should satisfy the following requirement. 

TR 3.4.1. The subgraph composed of the vertices belonging to each side of 
the cutting plane and the edges connecting them should form a connected graph. 
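
To make TR 3.4.1 concrete, the following sketch tests whether a proposed above/below classification of the vertices induces a connected subgraph on each side. It assumes the polyhedron is given only by its edge list and that `side` maps each vertex to +1 or -1; the names `satisfies_tr_3_4_1`, `cube_edges`, `good` and `bad` are hypothetical.

```python
def induces_connected_subgraph(group, edges):
    """True if the vertices in `group` are connected using only edges inside it."""
    group = set(group)
    if not group:
        return True  # an empty side imposes no constraint
    adj = {v: set() for v in group}
    for a, b in edges:
        if a in group and b in group:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(group))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v] - seen:
            seen.add(u)
            stack.append(u)
    return seen == group

def satisfies_tr_3_4_1(edges, side):
    """side[v] > 0 means v was judged to lie on the positive side of the plane."""
    pos = [v for v, s in side.items() if s > 0]
    neg = [v for v, s in side.items() if s <= 0]
    return (induces_connected_subgraph(pos, edges)
            and induces_connected_subgraph(neg, edges))

# Unit cube: vertex i has the bits of i as coordinates; edges join
# vertices that differ in exactly one coordinate.
cube_edges = [(a, b) for a in range(8) for b in range(a + 1, 8)
              if bin(a ^ b).count("1") == 1]
good = {v: (1 if v & 4 else -1) for v in range(8)}        # split by one coordinate
bad = {v: (1 if v in (0, 7) else -1) for v in range(8)}   # two opposite corners
```

A classification like `bad` cannot arise from any real cutting plane, so a robust implementation can reject it without consulting the numerical values again.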

This topological requirement was used to construct robust software [19].



3.5 Three-Dimensional Convex Hull 

For a finite set $S = \{P_1, P_2, \ldots, P_n\}$ in the three-dimensional space,
the convex hull, denoted by $C(S)$, forms a convex polyhedron.

In the divide-and-conquer method, we apply the following procedure recursively
to construct $C(S)$: first the set $S$ is divided into the left half $S_L$ and
the right half $S_R$, next the convex hulls $C(S_L)$ and $C(S_R)$ are
constructed, and finally $C(S_L)$ and $C(S_R)$ are merged into $C(S)$.

Suppose that we have already constructed $C(S_L)$ and $C(S_R)$. Then, in the
merge process, we first find a common tangent line $uv$ connecting vertex $u$ of
$C(S_L)$ and vertex $v$ of $C(S_R)$. Next, we find a triangle having the
vertices $u$, $v$ and another vertex $w$ of $C(S_L)$ or $C(S_R)$ such that the
triangle $uvw$ is contained in a common tangent plane of $C(S_L)$ and $C(S_R)$.
This triangle is called a bridging triangle. This bridging triangle has another
common tangent edge; this edge is $uw$ if $w$ is a vertex of $C(S_R)$, whereas
it is $wv$ if $w$ is a vertex of $C(S_L)$. We replace $uv$ with this new common
tangent edge, and repeat finding the next bridging triangle until we reach the
initial common tangent edge. Finally, we remove the faces that are in the
interior of the new surface, thus obtaining the convex hull $C(S_L \cup S_R)$.
The merge process can be executed in $O(n)$ time, and hence the whole algorithm
runs in $O(n \log n)$ time.

A face or an edge of $C(S_L)$ or of $C(S_R)$ is said to be weakly visible if it
is visible from at least one point of the other convex hull. A face that is not
weakly visible is said to be invisible. A face of $C(S_L)$ or $C(S_R)$ is
removed in the merge process if and only if it is weakly visible from the other
convex hull. A weakly visible edge is said to be clearly visible if there is at
least one point on the other convex hull from which the edge and both of its
side faces are visible simultaneously.

Let $X_L$ be the set of all weakly visible open faces (meaning the faces
excluding their boundaries) and all clearly visible edges of $C(S_L)$. $X_L$
forms a connected region, and the interior (i.e., the region obtained by
removing the boundary from $X_L$) is simply connected (i.e., has no holes). If
we trace the boundary of $X_L$ in such a way that $X_L$ is always to the left,
we get a directed cycle, which we call the left silhouette cycle. The right
silhouette cycle is defined similarly. Here we have the following requirements.

TR 3.5.1. The silhouette cycle should not traverse the same edge twice in 
the same direction. 

TR 3.5.2. The silhouette edge should not cross over itself (i.e., should not
travel from one side to the other of another part of the silhouette cycle).

TR 3.5.3. The left and right silhouette cycles should be chosen in such a
way that the total number of remaining vertices is at least four.

As in the case of the three-dimensional Voronoi diagram, the software based on
these topological requirements also requires tuning of the numerical
computation if we want to get good-quality output [?].

The gift-wrapping algorithm was also rewritten along the lines of the
topology-oriented approach [?].



3.6 Voronoi Diagram for Line Segments 

For a finite number of mutually disjoint line segments in the plane, the region
nearer to a line segment than to any others is called the Voronoi region of the
line segment. The plane is partitioned into the Voronoi regions of the line
segments and their boundaries; this partition is called the Voronoi diagram for
the line segments. The boundary edges of this Voronoi diagram contain parabolic
curves, and hence any algorithm for this diagram is much more sensitive to
numerical errors than the other examples we have seen above.

In the incremental construction of this Voronoi diagram, we partition each 
line segment into the open line segment and the two terminal points, and con- 
struct the Voronoi diagram for all the terminal points first using one of the 
robust methods mentioned in the previous subsections. Next, we add the open 
line segments one by one and modify the diagram accordingly. In each addition 
of an open line segment, we remove some substructure from the old diagram 
and generate new edges to form the new region, where the following properties 
should be satisfied. 

TR 3.6.1. The substructure to be removed should form a tree in a graph 
theoretic sense. 

TR 3.6.2. The substructure to be removed should contain a path connecting 
the Voronoi regions of the two terminal points. 

TR 3.6.3. The newly generated edges should form a cycle that exactly
encloses the substructure to be removed.

These topological requirements were used in our topology-oriented software
[?]. The experiments showed that this software runs in $O(n)$ time on the
average for a large class of input data.

4 Concluding Remarks 

The basic idea of the topology-oriented approach to robust geometric computa- 
tion, and various implementation examples have been presented. The topology- 
oriented approach is a general principle for implementing geometric algorithms 
in a numerically robust manner. In this approach we first collect the topological 
requirements that should be satisfied by the geometric objects, and next rewrite 
the algorithms so that these requirements are always fulfilled. 

The resulting software has many good properties. First, it is completely
robust against numerical errors; no matter how large the errors are, the
software never comes across inconsistency, and hence can pursue the task to the
end, giving some output. Secondly, the output is at least topologically
consistent in the sense that the collected topological requirements are always
satisfied. Thirdly, the output converges to the true solution as the precision
in computation increases.
Fourthly, we need not worry about the degenerate situations at all in the im- 
plementation of the software, and hence the structure of the resulting software 
is very simple; it contains the procedure only for the general case. Recall that 
we start our implementation with the assumption that numerical errors are in- 
evitable. In this environment we cannot recognize degeneracy, and hence need 
not prepare any exceptional processing for degenerate cases. This may sound pa- 
radoxical, but actually if we accept the existence of errors, the implementation 
becomes simple. Fifthly, topology-oriented implementation does not require any 
condition on the arithmetic precision, and consequently can use floating-point 
arithmetic, which is fast compared with multiple precision rational arithmetic 
commonly used in the exact-arithmetic approaches. 

This work is partly supported by the Grant-in-Aid for Scientific Research of
the Ministry of Education, Science, Sports and Culture of Japan.

Abstract. We present a randomized algorithm for approximating multicast
congestion (a generalization of path congestion) to within $O(\log n)$ times
the best possible. Our main tools are a linear programming relaxation and
iterated randomized rounding.



1 The Problem 

Let $G = (V, E)$ represent a communication network with $n$ vertices. A
multicast request is specified as a subset of vertices, called terminals, that
should be connected. In the multicast congestion problem, we are given several
multicast requests $S_1, \ldots, S_m \subseteq V$. A solution to the problem is
a set of $m$ trees, where the $t$-th tree spans the terminals of the $t$-th
multicast request. The congestion of an edge in a solution is the number of
multicast trees that use the edge. The problem is to find a set of multicast
trees that minimize the maximum congestion (over all the edges).
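
The objective just defined can be computed directly from a candidate solution. The sketch below assumes each tree is given as a list of undirected edges (pairs of vertex names); the function name `max_congestion` is hypothetical.

```python
from collections import Counter

def max_congestion(trees):
    """Maximum, over all edges, of the number of trees that use the edge."""
    load = Counter()
    for tree in trees:
        # A tree uses an edge at most once; frozenset treats (u, v) and (v, u) alike.
        for edge in {frozenset(e) for e in tree}:
            load[edge] += 1
    return max(load.values(), default=0)
```

For instance, three trees that all pass through one edge give congestion 3 on that edge, however the rest of the network is used.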

The multicast congestion problem generalizes two well-known optimization
problems. The special case where each request consists of just two terminals is
the standard routing problem of finding integral paths with minimum congestion.
The latter problem (which itself is a generalization of the problem of finding
edge disjoint paths) is in Karp's original list of NP-hard combinatorial
problems [?], and hence our general problem is also NP-hard. On the other hand,
integral path congestion can be approximated within a factor of $O(\log n)$
[?]. Our main result is an algorithm for approximating the multicast congestion
to within a factor of $O(\log n)$, thus placing it in the same level of
approximability as the special case.

The second problem that multicast congestion generalizes is the minimum
Steiner tree problem, which is the special case of a single multicast with a
slightly different objective function. The goal is to minimize the sum rather
than the maximum of the congestion over all edges. Finding a minimum Steiner
tree is max-SNP hard, i.e., there is a constant $\epsilon > 0$ such that it is
NP-hard to find a $(1 + \epsilon)$-approximation [1] (and thus our problem is
also max-SNP hard). The precise constant to which the Steiner tree can be
approximated is an open problem. The best known upper bound is 1.598 [3].

* Santosh Vempala was supported by a Miller fellowship at U.C. Berkeley and an NSF 
CAREER award. Berthold Vocking was supported by a grant of the “Gemeinsames 
Hochschulsonderprogramm III von Bund und Landern” through the DAAD. 

Our algorithm relies on a linear programming relaxation of the problem. The
relaxation is motivated by the following observation: in any integral solution
to the problem, there is at least one edge across any cut that separates two
terminals belonging to the same multicast request. This can be viewed as a
multicommodity flow for each multicast, with the objective of minimizing the
net congestion. The LP formulation is described in detail in Section 2.

A solution to the LP can be viewed as a separate fractional solution for each
multicast. Unfortunately, the fractional solution cannot be decomposed into a
set of trees of fractional weights, which would allow us to adapt the simple
rounding algorithm used for the standard routing problem [?], i.e., for each
multicast, choose one of the fractional trees at random with respect to the
fractional weights of the trees. Instead we decompose the fractional solution
for each multicast into a set of paths. These paths have the property that the
total flow from each multicast terminal to other terminals in the same
multicast is at least 1. We apply a variant of randomized rounding to these
paths. A single iteration of randomized rounding, however, does not suffice,
and we need to iterate randomized rounding on a subproblem till we find a
solution. The rounding algorithm is presented in Section 3.

To prove the approximation guarantee, we make two observations. The first
is that in each phase (i.e., one iteration) the expected congestion is no more
than twice the fractional congestion (i.e., the congestion of the LP). Second,
the total number of phases is at most $\log k$, where $k$ is the maximum number
of terminals in a single multicast, which is, of course, bounded by $n$. Since
the fractional congestion in each phase is a lower bound on the integral
congestion, the expected congestion of an edge over the entire algorithm is at
most $O(\log k)$ times the optimal congestion. Further, the congestion is
concentrated around this expectation and a straightforward application of
Chernoff bounds leads to an $O(\log n)$ approximation (the details are in
Section 4).

2 The Relaxation 

Let $x_{te}$ be a 0-1 variable indicating whether edge $e$ is chosen for the
$t$-th multicast. Consider the following integer program (IP).
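
The display of the program is not legible in this copy. A formulation consistent with the surrounding description (cut constraints that connect each multicast, and integrality constraints that are later relaxed) might look as follows; the symbols $z$, $U$ and $\delta(U)$ are assumptions introduced here, with $z$ the maximum congestion and $\delta(U)$ the set of edges leaving the vertex set $U$.

```latex
\begin{align*}
\min\quad & z \\
\text{s.t.}\quad
 & \sum_{e \in \delta(U)} x_{te} \ge 1
   && \forall t,\ \forall U \subset V \text{ with }
      U \cap S_t \neq \emptyset,\ S_t \setminus U \neq \emptyset, \\
 & \sum_{t} x_{te} \le z && \forall e \in E, \\
 & x_{te} \in \{0, 1\} && \forall t,\ \forall e \in E.
\end{align*}
```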

The constraints (1) ensure that any solution to the integer program connects
the vertices in each multicast. Thus, this program corresponds exactly to our
problem. An LP relaxation is obtained by relaxing the constraints (2) to simply
$0 \le x_{te} \le 1$. Let us refer to this relaxation as LP1. Although the
program has exponentially many constraints, it can be solved in polynomial (in
$n$) time by designing a separation oracle ([?]) that checks whether the flow
across every relevant cut is at least 1.

In the case of the standard routing problem, i.e., two terminals in each
multicast, a fractional solution of LP1 can be decomposed into several paths of
fractional weight so that the sum of the weights of the fractional paths for
each multicast is one, and the sum of the weights of the paths crossing an edge
$e$ corresponds to the weight of that edge, i.e., $\sum_t x_{te}$. Therefore,
choosing one of the fractional paths for each multicast at random (with respect
to the weights) results in an integral solution in which the expected number of
paths that cross an edge corresponds to the weight of that edge. Unfortunately,
in the case of more than two terminals, the fractional solution cannot be
decomposed into a collection of fractional trees that add up to
$\sum_t x_{te}$, for each edge $e$. For this reason, it is unclear how to
obtain an integral solution of bounded cost from LP1.

Instead we consider a different relaxation LP2 which is at least as strong as
the one above (i.e., the feasible region of the second relaxation is contained
in the feasible region of the first relaxation). The relaxation is described in
terms of a multicommodity flow between pairs of terminals of each multicast. In
the relaxation, the variable $f_t(i,j)$ denotes the flow between the terminals
$i$ and $j$ in the $t$-th multicast, and $x_{te}(i,j)$ denotes the flow on edge
$e$ of commodity $(i,j)$ in the $t$-th multicast.

Theorem 1. $\min \mathrm{LP1} \le \min \mathrm{LP2} \le \min \mathrm{IP}$.

Proof. Any solution to the multicast congestion problem (IP) can be mapped
to a solution of LP2 with the same objective value as follows. For each
multicast, connect all nodes in the integral tree by a cycle so that each edge
in the tree is included in the cycle exactly twice. Each pair of terminals that
are neighbors in the cycle (i.e., terminals that are connected by a subpath of
the cycle that does not include any other terminal) exchange a flow of weight
1/2. Figure 1 gives an example of this transformation. This construction
defines a multicommodity flow along the edges of the tree so that the total
flow along each edge is 1 and the total flow across any cut in the graph that
separates two or more terminals is at least 1. Hence, the constraints of LP2
can be satisfied while preserving the objective value.

Fig. 1. Transformation of an IP solution to a solution of LP2.

Any solution of LP2 can be mapped to a solution of LP1 with the same
objective value by simply adding up all the flows on an edge, separately for
each multicast. □

Theorem 2. The relaxation LP2 can be solved in polynomial time. 

Proof. We design a separation oracle for LP2. Given an assignment to the
variables of LP2, for each multicast $S_t$, we can construct a complete graph
$G_t = (S_t, E_t)$ where the weight of an edge $(i,j)$ is $f_t(i,j)$. Then
constraints (4) simply require that every cut in this graph has weight at least
1. This can be easily checked by finding the minimum cut in the graph. The
other constraint sets can be checked in polynomial time by examination. □
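
The minimum-cut check in the proof can be sketched as follows. The proof does not say which minimum-cut routine is meant; this sketch uses the Stoer-Wagner algorithm on a symmetric matrix of the values $f_t(i,j)$, and the names `global_min_cut` and `violates_cut_constraint` are hypothetical.

```python
def global_min_cut(weight):
    """Stoer-Wagner minimum cut of a graph given as a symmetric weight matrix."""
    w = [row[:] for row in weight]
    active = list(range(len(w)))
    best = float("inf")
    while len(active) > 1:
        # One minimum-cut phase: grow a set from active[0], always adding
        # the vertex most tightly connected to the set built so far.
        conn = {v: w[active[0]][v] for v in active[1:]}
        prev = last = active[0]
        cut_of_phase = 0.0
        while conn:
            prev = last
            last = max(conn, key=conn.get)
            cut_of_phase = conn.pop(last)
            for v in conn:
                conn[v] += w[last][v]
        best = min(best, cut_of_phase)
        # Merge the last two vertices of the phase.
        for v in active:
            if v != last and v != prev:
                w[prev][v] += w[last][v]
                w[v][prev] = w[prev][v]
        active.remove(last)
    return best

def violates_cut_constraint(flow, eps=1e-9):
    """True if some cut of the terminal graph carries total flow below 1."""
    return global_min_cut(flow) < 1 - eps
```

When the oracle reports a violation, the cut found in the offending phase would supply the separating inequality; recovering that cut explicitly is omitted here.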

3 Iterated Randomized Rounding 

A solution to LP2 can be separated into fractional solutions for each multicast.
We will now describe how to round these fractional solutions into trees, one for
each multicast. The rounding is completely independent for each multicast and is
achieved by the following procedure applied separately to the fractional solution
for each multicast (i.e., the values of $x_{te}(i,j)$, separately for each $t$).

1. Decompose the fractional solution into flow paths. 

2. Choose one path randomly out of each multicast terminal with probability 
equal to the value of the flow on the path (randomized rounding). 

3. If the multicast terminals are all connected, then stop and output the solu- 
tion. 

4. Otherwise, contract the vertices corresponding to connected components, 
and form a new multicast problem with the contracted vertices as the new 
multicast terminals. 

5. The original fractional solution is still a solution for the new multicast
problem. Repeat the above steps till all multicast terminals are connected.
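
The five steps above can be sketched for a single multicast as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the path decomposition of step 1 is taken as given in `paths_from`, where `paths_from[t]` lists `(path, flow)` pairs and each path is a vertex list from terminal `t` to another terminal; contraction is simulated by component labels; and a contracted component picks among the paths of its members that leave it, with probability proportional to flow. All names are hypothetical.

```python
import random

def iterated_rounding(terminals, paths_from, rng=random):
    """Return (set of chosen edges, number of phases) for one multicast."""
    comp = {t: t for t in terminals}   # terminal -> label of its component
    chosen = set()
    phases = 0
    while len(set(comp.values())) > 1:
        phases += 1
        picks = []
        for c in sorted(set(comp.values())):
            # Flow paths that start in component c and end outside it.
            cand = [(p, f) for t in terminals if comp[t] == c
                    for p, f in paths_from[t] if comp[p[-1]] != c]
            if not cand:
                raise ValueError("no flow path leaves component %r" % (c,))
            paths, flows = zip(*cand)
            picks.append(rng.choices(paths, weights=flows, k=1)[0])
        for p in picks:
            chosen.update(frozenset(e) for e in zip(p, p[1:]))
            a, b = comp[p[0]], comp[p[-1]]
            if a != b:                 # contract: merge component b into a
                for t in terminals:
                    if comp[t] == b:
                        comp[t] = a
    return chosen, phases
```

The returned edge set connects all terminals but may contain cycles; pruning it to a tree is left out of the sketch.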



Lemma 1. There are at most $\log k$ iterations, where $k$ is the maximum number
of terminals in a multicast.

Proof. In each iteration, each multicast terminal is connected to at least one 
other terminal. Thus, the number of components drops by at least a factor of 
two in each iteration. □ 



4 The Approximation Guarantee 

Let $X_{te}^{\ell}$ be the random variable indicating whether an active edge
$e$ is selected for the $t$-th multicast in the $\ell$-th phase of the
rounding. Define $x_{te} = \sum_{i,j} x_{te}(i,j)$, i.e., the total flow
through edge $e$ of the $t$-th multicast.

Lemma 2. $E[X_{te}^{\ell}] \le 2 \cdot x_{te}$.

Proof. The flow through the edge $e$ for each commodity $(i,j)$ is divided among
a set of paths. The random variable $X_{te}^{\ell}$ is 1 if one of these paths
is selected and zero otherwise. The expected value of $X_{te}^{\ell}$ is at
most the sum of the probabilities of these paths being selected. Since the
probability of a path being selected by any fixed one of its two end points is
exactly the flow on it, the sum of the probabilities of all paths through $e$
(for the $t$-th multicast) is $2 \cdot x_{te}$. □

Theorem 3. With high probability, the congestion of the solution found by the
algorithm is $O(\log k \cdot \mathrm{OPT} + \log n)$.

Proof. Fix an edge $e$. Let $L(e)$ denote the number of multicasts including
$e$. By definition, $L(e) = \sum_{t} \sum_{\ell} X_{te}^{\ell}$. From Lemma 2
we can conclude that $E[L(e)] \le \sum_{\ell} \sum_{t} 2 \cdot x_{te} \le
2 \cdot \log k \cdot \mathrm{OPT}$. We will apply a Chernoff bound to show that
it is unlikely that $L(e)$ deviates significantly from its expectation.

For each multicast $t$, the random variables $X_{te}^{\ell}$ are independent of
the random variables for other multicasts. Random variables for the same
multicast, however, may be dependent, but these variables are negatively
correlated, because an edge $e$ that is chosen in round $\ell$ for a multicast
will not be chosen in round $\ell' > \ell$ for the same multicast. As a
consequence, we can apply a Chernoff bound (see e.g. [?]). Set
$C \ge \max\{2e \cdot E[L(e)],\ 2 \cdot (\alpha + 2) \cdot \log n\}$ with
$\alpha > 0$ denoting an arbitrary constant. Then

$\mathrm{Prob}[L(e) \ge C] \le 2^{-C/2} \le n^{-\alpha-2}$.

Summing this probability over all edges yields that the congestion does not
exceed $C = O(\log k \cdot \mathrm{OPT} + \log n)$, with probability
$1 - n^{-\alpha}$. □
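
The last two steps of the proof can be spelled out as follows, assuming that logarithms are to base 2 and that the graph is simple, so that $|E| \le n^2$ (both are assumptions made here for the calculation).

```latex
\begin{align*}
\Pr[L(e) \ge C] &\le 2^{-C/2} \le 2^{-(\alpha+2)\log n} = n^{-\alpha-2}, \\
\Pr[\exists\, e \in E : L(e) \ge C] &\le |E| \cdot n^{-\alpha-2}
  \le n^{2} \cdot n^{-\alpha-2} = n^{-\alpha}.
\end{align*}
```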

5 Open Questions

An obvious open question is whether the approximation factor can be improved.
It is worth noting that path congestion is conjectured to be approximable to
within a factor of 2. In fact, the worst known gap for the LP relaxation is a
factor of 2. However, the best known approximation even for that special case
is $O(\log n)$.

The path congestion problem can actually be approximated to within a con- 
stant times the optimum plus O(logn) (which is also a multiplicative O(logn) 
when the congestion is a constant, but better nevertheless). A less ambitious 
open problem is whether the tree congestion can also be approximated to within 
a constant plus O(logn). 

Abstract. For an edge weighted undirected graph $G$ and an integer
$k \ge 2$, a $k$-way cut is a set of edges whose removal leaves $G$ with at
least $k$ components. We propose a simple approximation algorithm for
the minimum $k$-way cut problem. It computes a nearly optimal $k$-way
cut by using a set of minimum 3-way cuts. We show that the performance
ratio of our algorithm is $2 - 3/k$ for an odd $k$ and $2 - (3k-4)/(k^2-k)$
for an even $k$. The running time is $O(kmn^3 \log(n^2/m))$ where $n$ and $m$
are the numbers of vertices and edges respectively.



1 Introduction 

Given a simple, edge weighted undirected graph $G$ with $n$ vertices and $m$
edges, a $k$-way cut is a set of edges whose removal leaves $G$ with at least
$k$ components. The minimum $k$-way cut problem is to find a $k$-way cut with
the minimum weight (called a min-$k$-way cut). This problem is one of the
extensions of the classical minimum $s,t$-cut problem and has practical
significance in the area of VLSI design [?] and parallel computing systems [?].
It is known that this problem is NP-hard if $k$ is arbitrary [2]. For a fixed
$k$ there are several polynomial
time algorithms that solve it exactly. Dalhaus et al. [I] gave an 0{rP^^'>) time 
algorithm for planar graphs. For arbitrary graphs, Goldschmidt et al. |2] gave an 
0{n^ time algorithm, where F{m,n) stands for the running 

time of a maximum flow algorithm (e.g., O (mn login? /m)) due to |^). Karger et 
al. 1^ presented an log^ n) time randomized Monte Garlo algorithm. 

Recently in the article Kamidoi et al. proposed another deterministic algo- 
rithm and they claim that it can find an optimal solution in F{m,n)) 

time. Faster algorithms have been developed for special cases: [2j for fc = 3,4 
and [m] for fc = 5, 6. 

Since the problem for arbitrary $k$ is NP-hard, it is interesting to design
approximation algorithms that run in polynomial time. Saran et al. [?] gave two
algorithms based on maximum flow computations. They showed that both of their
algorithms can achieve a ratio of $2 - 2/k$ and have running times of
$O(nF(m,n))$ and $O(kF(m,n))$ respectively. It is also natural to compute an
approximation solution in the following greedy way. For a fixed $j$
($2 \le j \le k$), first divide the graph into $j$ components by removing a
min-$j$-way cut, then repeat removing a minimum weight edge set whose removal
increases the number of components by at least $j - 1$ (the last iteration may
increase the number of components by less) until there are at least $k$
components. Then the set of all removed edges is a $k$-way cut. Intuitively one
may guess that a larger $j$ will take longer time but can yield a better ratio.
In fact, Kapoor [4] showed that for $j = 2$ the ratio $2 - 2/k$ can be achieved
in $O(kn(m + n \log n))$ time. The same article [4] also claimed that for any
$j \ge 3$, a ratio of $2 - j/k + (j-2)/k^2 + O(j/k^3)$ can be achieved in
polynomial time for fixed $j$. However, the proof of the correctness for
$j \ge 3$ is not complete since it contains a lemma, Lemma 4.3 [4], which is
not valid in general (as will be discussed in Sect. 3.4). In this paper, we show
that with a slight modification, the above algorithm with $j = 3$ can achieve a
ratio of $2 - 3/k$ for odd $k \ge 3$ and $2 - (3k-4)/(k^2-k)$ for even
$k \ge 2$ (note that $2 - 3/k < 2 - (3k-4)/(k^2-k) = 2 - 3/k + 1/k^2 +
O(1/k^3)$). Our algorithm needs at most $k/2$ min-3-way cut computations. Since
one computation takes $O(mn^3 \log(n^2/m))$ time due to [?], the running time
of our algorithm is $O(kmn^3 \log(n^2/m))$.

Education, Science, Sports and Culture of Japan. The first author was also supported
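
To compare the ratios quoted above for concrete values of $k$, the following sketch evaluates them with exact rational arithmetic. The function names `ratio_3split` and `ratio_2split` are hypothetical labels for the $j = 3$ and $j = 2$ bounds.

```python
from fractions import Fraction

def ratio_3split(k):
    """Bound for the j = 3 algorithm: 2 - 3/k (odd k), 2 - (3k-4)/(k^2-k) (even k)."""
    if k % 2 == 1:
        return 2 - Fraction(3, k)
    return 2 - Fraction(3 * k - 4, k * k - k)

def ratio_2split(k):
    """Bound 2 - 2/k quoted for the j = 2 algorithm."""
    return 2 - Fraction(2, k)
```

For example, `ratio_3split(4)` evaluates to 4/3 while `ratio_2split(4)` is 3/2, and the two bounds coincide only at $k = 2$.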

2 Preliminaries and Algorithm 

We use $(G, w)$ to denote a network where $G$ is a graph and $w$ is an edge
weight function. We denote the number of components in $G$ by
$\mathrm{comp}(G)$. For two vertices $v_1$ and $v_2$, we denote the edge
between $v_1$ and $v_2$ by $(v_1, v_2)$. For two vertex sets $V_1$ and $V_2$ we
use $E_G(V_1, V_2)$ to denote the edge set $\{(v_1, v_2) \in G \mid v_1 \in
V_1, v_2 \in V_2\}$. Generally, we define $E_G(V_1, V_2, \ldots, V_p) =
\bigcup_{1 \le i < j \le p} E_G(V_i, V_j)$ for vertex sets $V_1, V_2, \ldots,
V_p$. For an edge set $E'$, we use $G - E'$ (resp., $G + E'$) to denote the
graph derived from $G$ by removing (resp., adding) $E'$.

For a network $(G, w)$ with a real weight function $w$ (note that $G$ may not
be connected), we want to find a minimum weight edge set $F$ whose removal
increases the number of components by at least 2. This problem can be reduced
to a minimum 3-way cut problem in the following way. (It can also be solved via
a DP approach, see [?].)

PROCEDURE INC2COMP($G, w$)

Input: A network $(G, w)$ with a real weight function $w$.

Output: A minimum weight edge set $F$, subject to
$\mathrm{comp}(G - F) - \mathrm{comp}(G) \ge 2$.

1. $C_1, C_2, \ldots, C_p \leftarrow$ the components in $G$;

2. choose arbitrarily vertices $v_1 \in C_1, v_2 \in C_2, \ldots, v_p \in C_p$;

3. $\mathrm{EXTG} \leftarrow G + \{(v_1, v_j) \mid j = 2, \ldots, p\}$;

4. $w((v_1, v_j)) \leftarrow \infty$ for $j = 2, \ldots, p$;

5. $F \leftarrow$ a min-3-way cut in $(\mathrm{EXTG}, w)$.



Approximating the Minimum k-way Cut





(Note that if $w > 0$ then $\mathrm{comp}(G - F) - \mathrm{comp}(G) = 2$
holds.) Now we can describe our approximation algorithm for the minimum $k$-way
cut problem. We suppose that the given graph is connected and the edge weight
function is positive. For an even $k$, we will first compute a min-2-way cut
before computing min-3-way cuts (in EXTG). (This is different from the
algorithm in [4] for $j = 3$, which computes the min-3-way cuts first.)

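ALGORITHM 3-SPLIT($G_0, w, k$)

Input: A connected network $(G_0, w)$ with $w > 0$, and an integer $k$
($2 \le k \le n$).
Output: An edge set $H$, such that $\mathrm{comp}(G_0 - H) \ge k$.

1. $G \leftarrow G_0$; $H \leftarrow \emptyset$;

2. if $k$ is even then

3. $F \leftarrow$ a min-2-way cut in $(G, w)$; $G \leftarrow G - F$; $H \leftarrow F$;

4. end

5. while $\mathrm{comp}(G) < k$ do

6. $F \leftarrow$ INC2COMP($G, w$);

7. $G \leftarrow G - F$; $H \leftarrow H \cup F$;

8. end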
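
The control flow of 3-SPLIT can be tried out on very small graphs with the sketch below. It is only an illustration under strong assumptions: instead of the reduction to a min-3-way cut in EXTG, the helper `min_cut_increasing` finds the cheapest edge set that raises the number of components by a given amount by brute force over all edge subsets, which is exponential. The names `components`, `min_cut_increasing` and `three_split` are hypothetical.

```python
from itertools import combinations

def components(vertices, edges):
    """Number of connected components, computed with union-find."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def min_cut_increasing(vertices, weights, inc):
    """Cheapest edge set whose removal adds at least `inc` components (brute force)."""
    edges = list(weights)
    base = components(vertices, edges)
    best, best_w = None, float("inf")
    for r in range(1, len(edges) + 1):
        for cut in combinations(edges, r):
            wt = sum(weights[e] for e in cut)
            if wt < best_w:
                rest = [e for e in edges if e not in cut]
                if components(vertices, rest) - base >= inc:
                    best, best_w = set(cut), wt
    return best

def three_split(vertices, weights, k):
    """Follow steps 1-8 of 3-SPLIT; `weights` maps edges (u, v) to positive weights."""
    weights = dict(weights)           # G <- G_0 (working copy)
    H = set()
    if k % 2 == 0:                    # even k: start with a min-2-way cut
        F = min_cut_increasing(vertices, weights, 1)
        H |= F
        for e in F:
            del weights[e]
    while components(vertices, list(weights)) < k:
        F = min_cut_increasing(vertices, weights, 2)   # stand-in for INC2COMP
        H |= F
        for e in F:
            del weights[e]
    return H
```

On a 4-cycle with alternating weights 1 and 5, the sketch returns a cut of weight 7 for $k = 3$ and of weight 12 for $k = 4$, both of which happen to be optimal for that instance.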

Theorem 1. For a connected network $(G_0, w)$ and an integer $k$, where $G_0$
has $n$ vertices and $m$ edges, $w > 0$ and $2 \le k \le n$, algorithm 3-SPLIT
terminates in $O(kmn^3 \log(n^2/m))$ time. Let $C^*$ be a minimum $k$-way cut
in $(G_0, w)$ and $H$ be the output of 3-SPLIT. Then

$$w(H) \le \begin{cases} (2 - 3/k)\, w(C^*) & \text{if } k \text{ is odd}, \\ \left(2 - \dfrac{3k-4}{k^2-k}\right) w(C^*) & \text{if } k \text{ is even}. \end{cases}$$

□



3 Proof of Theorem 1

3.1 Mainstream of the Proof 

By the assumption of $w > 0$, 3-SPLIT executes exactly $\lfloor k/2 \rfloor$
minimum 2-way or 3-way cut computations. Since one computation takes
$O(mn^3 \log(n^2/m))$ time [?], the stated running time follows. Next we
consider the approximation ratio. Let $H = F_1 \cup \cdots \cup
F_{\lfloor k/2 \rfloor}$ denote the $k$-way cut output by 3-SPLIT, where each
$F_i$ denotes the $i$-th edge set found by 3-SPLIT. Let $G_i = G_{i-1} - F_i =
G_0 - (F_1 \cup \cdots \cup F_i)$. Let $V$ be the vertex set of graph $G_0$. We
say that $\mathcal{P} = \{V_1, V_2, \ldots, V_p\}$ is a $p$-way partition if
all $V_i$ are nonempty and disjoint and $V_1 \cup V_2 \cup \cdots \cup V_p = V$
holds. Define $E_{G_0}(\mathcal{P}) = E_{G_0}(V_1, V_2, \ldots, V_p)$, which is
a $p$-way cut in $G_0$. Given a $p$-way cut $C$ in $G_0$, let $V_1, V_2,
\ldots, V_q$ be the vertex sets of the components in $G_0 - C$. Define
$\mathcal{V}(C) = \{V_1, V_2, \ldots, V_q\}$, which is a $q$-way partition.
Note that $|\mathcal{V}(C)| = q = \mathrm{comp}(G_0 - C) \ge p$.

Definition 1. Given an odd $k$ and a $k$-way partition $\mathcal{P} = \{V_1,
V_2, \ldots, V_k\}$ (as an ordered set), define $A_i(\mathcal{P}) =
E_{G_0}(V_{2i-1}, V_{2i}, V - (V_{2i-1} \cup V_{2i}))$ for $i = 1, 2, \ldots,
(k-1)/2$.

Definition 2. Given an even $k$ and a $k$-way partition $\mathcal{P} = \{V_1,
V_2, \ldots, V_k\}$ (as an ordered set), define $A_1(\mathcal{P}) =
E_{G_0}(V_1, V - V_1)$ and $A_i(\mathcal{P}) = E_{G_0}(V_{2i-2}, V_{2i-1},
V - (V_{2i-2} \cup V_{2i-1}))$ for $i = 2, \ldots, k/2$.

Now we state the following two lemmas, based on which the proof of Theorem 1
is then given. We will prove the two lemmas in Subsections 3.2 and 3.3.

Lemma 1. Given an odd $k$ and a connected network $(G_0, w)$ where $w > 0$, for
any $k$-way partition $\mathcal{P} = \{V_1, \ldots, V_k\}$,

$$w(H) = \sum_{i=1}^{k_0} w(F_i) \le \sum_{i=1}^{k_0} w(A_i(\mathcal{P})), \qquad (1)$$

where $k_0 = (k-1)/2$, and $A_i(\mathcal{P})$ are defined in Definition 1. □

Lemma 2. Given an even $k$ and a connected network $(G_0, w)$ where $w > 0$, for
any $k$-way partition $\mathcal{P} = \{V_1, \ldots, V_k\}$,

$$w(H) = \sum_{i=1}^{k_0} w(F_i) \le \sum_{i=1}^{k_0} w(A_i(\mathcal{P})), \qquad (2)$$

where $k_0 = k/2$, and $A_i(\mathcal{P})$ are defined in Definition 2. □

These lemmas tell us that for any $k$-way partition $\mathcal{P}$, the output
of 3-SPLIT has a weight no more than $\sum_{i} w(A_i(\mathcal{P}))$. We now
estimate this sum in terms of the weight of the min-$k$-way cut $C^*$. Let
$\mathcal{P}^* = \mathcal{V}(C^*) = \{V_1^*, V_2^*, \ldots, V_k^*\}$ (notice
that $|\mathcal{P}^*| = k$ holds by $w > 0$ and the optimality of $C^*$).
Notice that $\sum_{i=1}^{k_0} w(A_i(\mathcal{P}^*))$ varies depending on the
numbering of $V_1^*, V_2^*, \ldots, V_k^*$.

Lemma 3. For an odd k, there is a numbering of $V_1^*, V_2^*, \ldots, V_k^*$ which satisfies

$$\sum_{i=1}^{k_0} w(A_i(\mathcal{P}^*)) \le (2 - 3/k)\, w(C^*), \quad \text{where } k_0 = (k-1)/2. \qquad (3)$$

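Proof. Since $G_0$ is undirected, we have $\sum_{i=1}^{k} w(E_{G_0}(V_i^*, V - V_i^*)) = 2w(C^*)$.
Thus we can suppose without loss of generality that
$w(E_{G_0}(V_k^*, V - V_k^*)) \ge \frac{2}{k} w(C^*)$.
Let $\beta = w(E_{G_0}(V_1^*, V_2^*, \ldots, V_{k-1}^*)) = w(C^* - E_{G_0}(V_k^*, V - V_k^*))$. Then
we have

$$\beta \le \left(1 - \frac{2}{k}\right) w(C^*).$$

Let $Q$ be the complete graph with $2k_0$ $(= k - 1)$ vertices $u_1, u_2, \ldots, u_{k-1}$. For
all $1 \le i < j \le k - 1$, edge $(u_i, u_j)$ has weight $w(E_{G_0}(V_i^*, V_j^*))$. Note that
the total weight of the edges in $Q$ is $\beta$ by this definition. For a numbering
$V_{i_1}^*, V_{i_2}^*, \ldots, V_{i_k}^*$ of $V_1^*, V_2^*, \ldots, V_k^*$ where $i_k = k$
(thus $\mathcal{P}^* = \{V_{i_1}^*, \ldots, V_{i_{k-1}}^*, V_k^*\}$),
consider a perfect matching $\{(u_{i_1}, u_{i_2}), \ldots, (u_{i_{2k_0-1}}, u_{i_{2k_0}})\}$ in $Q$. Let
$\alpha = \sum_{j=1}^{k_0} w(A_j(\mathcal{P}^*))$ (recall that
$A_j(\mathcal{P}^*) = E_{G_0}(V_{i_{2j-1}}^*, V_{i_{2j}}^*, V - (V_{i_{2j-1}}^* \cup V_{i_{2j}}^*))$).
Then the weight of the matching $\mu = \sum_{j=1}^{k_0} w(E_{G_0}(V_{i_{2j-1}}^*, V_{i_{2j}}^*))$ satisfies
$\alpha + \mu = w(C^*) + \beta$ (where edges in $E_{G_0}(V_k^*, V - V_k^*)$ are counted once, and the other
edges are counted twice). Consider a numbering that makes $\mu$ maximum
in $Q$. We show that $\mu \ge \beta/(k-2)$. For two matching edges $(u_{i_{2j-1}}, u_{i_{2j}})$
and $(u_{i_{2l-1}}, u_{i_{2l}})$, there are four unmatched edges $(u_{i_{2j-1}}, u_{i_{2l-1}})$,
$(u_{i_{2j-1}}, u_{i_{2l}})$, $(u_{i_{2j}}, u_{i_{2l-1}})$ and $(u_{i_{2j}}, u_{i_{2l}})$. Since the matching is maximum,

$$w(u_{i_{2j-1}}, u_{i_{2l-1}}) + w(u_{i_{2j}}, u_{i_{2l}}) \le w(u_{i_{2j-1}}, u_{i_{2j}}) + w(u_{i_{2l-1}}, u_{i_{2l}})$$

and

$$w(u_{i_{2j-1}}, u_{i_{2l}}) + w(u_{i_{2j}}, u_{i_{2l-1}}) \le w(u_{i_{2j-1}}, u_{i_{2j}}) + w(u_{i_{2l-1}}, u_{i_{2l}})$$

hold. Thus the four unmatched edges have
weight at most $2\,(w(u_{i_{2j-1}}, u_{i_{2j}}) + w(u_{i_{2l-1}}, u_{i_{2l}}))$. Thus the unmatched edges
have total weight at most
$2 \sum_{1 \le j < l \le k_0} (w(u_{i_{2j-1}}, u_{i_{2j}}) + w(u_{i_{2l-1}}, u_{i_{2l}})) = 2(k_0 - 1)\mu$.
Therefore we have

$$\mu \ge \frac{\sum_{1 \le i < j \le k-1} w(u_i, u_j)}{2(k_0 - 1) + 1} = \frac{\beta}{k-2}. \qquad (4)$$

Thus the corresponding numbering satisfies

$$\sum_{i=1}^{k_0} w(A_i(\mathcal{P}^*)) = w(C^*) + \beta - \mu \le w(C^*) + \left(1 - \frac{1}{k-2}\right)\beta \le \left(2 - \frac{3}{k}\right) w(C^*). \qquad \Box$$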
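The key step of the proof, that a maximum-weight perfect matching in the complete graph $Q$ on $k-1$ vertices carries at least a $1/(k-2)$ fraction of the total edge weight, can be checked numerically on small instances. The sketch below is a brute-force sanity check under assumed names (`perfect_matchings`, `max_matching_weight`, `random_instance`); it is not an algorithm from the paper.

```python
import itertools
import random


def perfect_matchings(vertices):
    """Yield every perfect matching of the complete graph on an even-size vertex list."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def max_matching_weight(n, weight):
    """Weight mu of a maximum-weight perfect matching (exhaustive search)."""
    return max(
        sum(weight[frozenset(edge)] for edge in matching)
        for matching in perfect_matchings(list(range(n)))
    )


def random_instance(n, seed):
    """Random non-negative integer weights on the complete graph with n vertices."""
    rng = random.Random(seed)
    return {
        frozenset(pair): rng.randint(0, 9)
        for pair in itertools.combinations(range(n), 2)
    }


# n = k - 1 is even; the claimed bound is mu >= beta / (n - 1) = beta / (k - 2).
results = []
for n in (2, 4, 6):
    for seed in range(5):
        weight = random_instance(n, seed)
        beta = sum(weight.values())
        mu = max_matching_weight(n, weight)
        results.append((n, mu, beta))
```

The bound also follows from the fact that the edges of a complete graph on an even number $n$ of vertices decompose into $n-1$ perfect matchings, so the heaviest of them weighs at least $\beta/(n-1)$.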



In a similar way we can show the next lemma (an alternative proof can be 
found in m)- 

Lemma 4. For an even k, there is a numbering of $V_1^*, V_2^*, \ldots, V_k^*$ which satisfies

$$\sum_{i=1}^{k_0} w(A_i(\mathcal{P}^*)) \le \left(2 - \frac{3k-4}{k^2-k}\right) w(C^*), \quad \text{where } k_0 = k/2. \qquad (5)$$

□



By these lemmas, Theorem 1 has been proved.

3.2 Proof of Lemma 1

We proceed to prove Lemma 1 by induction on k. It is trivial for $k = 3$ since
$F_1$ is a min-3-way cut and $A_1$ is a 3-way cut in $G_0$. Suppose that Lemma 1
holds for $k = 2i - 1$. We consider the case of $k = 2i + 1$. First, if there exists a
$j \in \{1, 2, \ldots, i\}$ satisfying $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) \ge 2$, then edge set
$A_j(\mathcal{P})$ is a possible choice of $F_i$. By the optimality of $w(F_i)$, we have
$w(F_i) \le w(A_j(\mathcal{P}))$. By the induction hypothesis, we can assume that

$$w(F_1) + w(F_2) + \cdots + w(F_{i-1}) \le w(A_1(\mathcal{P})) + \cdots + w(A_{j-1}(\mathcal{P})) + w(A_{j+1}(\mathcal{P})) + \cdots + w(A_i(\mathcal{P}))$$

holds for a $(2i-1)$-way partition $\{V_1, \ldots, V_{2j-2}, V_{2j+1}, \ldots, V_{2i}, V_{2j-1} \cup V_{2j} \cup V_{2i+1}\}$,
which implies (1). Therefore, in what follows, we consider the case in which
$\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) \le 1$ holds for all $j = 1, 2, \ldots, i$.
Proposition 1. $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 0$ holds if and only if
$A_j(\mathcal{P}) \subseteq F_1 \cup \cdots \cup F_{i-1}$.

Proof. Recall that $G_{i-1} = G_0 - (F_1 \cup \cdots \cup F_{i-1})$. By the definition of $A_j(\mathcal{P})$ the
proposition is clear. □

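Proposition 2. If $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 1$, then there is exactly
one subset $W \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$ for which exactly one of the following (a) and
(b) holds.

(a) $W \subseteq V_{2j-1} \cup V_{2j}$ and $W \cap V_{2j-1} \ne \emptyset \ne W \cap V_{2j}$.

(b) $W \cap V_{2j-1} = \emptyset$ and $W \cap V_{2j} \ne \emptyset \ne W - V_{2j}$ (or symmetrically, $W \cap V_{2j} = \emptyset$
and $W \cap V_{2j-1} \ne \emptyset \ne W - V_{2j-1}$).

Furthermore, for any $W' \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1}) - \{W\}$, $W' \cap V_{2j-1} \ne \emptyset$ (resp.,
$W' \cap V_{2j} \ne \emptyset$) implies $W' \subseteq V_{2j-1}$ (resp., $W' \subseteq V_{2j}$).

Proof. By Proposition 1 there is an edge $(v_1, v_2) \in A_j(\mathcal{P}) - (F_1 \cup \cdots \cup F_{i-1})$.
Let $W \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$ satisfy $v_1, v_2 \in W$. Edge $(v_1, v_2) \in A_j(\mathcal{P})$ means
that at least one of $v_1, v_2$ is in $V_{2j-1} \cup V_{2j}$. Assume without loss of generality
that $v_1 \in V_{2j}$ (thus $v_2 \notin V_{2j}$). Since $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 1 < 2$
is assumed, it is easy to see that if $v_2 \in V_{2j-1}$ then (a) holds, while if $v_2 \notin V_{2j-1}$
then (b) holds. Notice that such $W$ must be unique, otherwise
$\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) \ge 2$ would hold. Thus for any
$W' \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1}) - \{W\}$, $W' \cap V_{2j-1} \ne \emptyset$ (resp., $W' \cap V_{2j} \ne \emptyset$)
implies $W' \subseteq V_{2j-1}$ (resp., $W' \subseteq V_{2j}$). □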

Proposition 3. If $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) \le 1$ holds for all
$j = 1, 2, \ldots, i$, then there exist two indices $j_1, j_2 \in \{1, 2, \ldots, i\}$ such that

$$\mathrm{comp}(G_{i-1} - (A_{j_1}(\mathcal{P}) \cup A_{j_2}(\mathcal{P}))) - \mathrm{comp}(G_{i-1}) \ge 2. \qquad (6)$$

More precisely, there are subsets $W_1, W_2 \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$ which satisfy one
of the following cases (i), (ii), and (iii) (symmetric cases are omitted).

(i) $W_1$ (resp., $W_2$) satisfies (a) in Proposition 2 with $j = j_1$ (resp., $j = j_2$),

(ii) $W_1$ (resp., $W_2$) satisfies (a) (resp., (b)) in Proposition 2 with $j = j_1$
(resp., $j = j_2$),

(iii) $W_1$ (resp., $W_2$) satisfies (b) in Proposition 2 with $j = j_1$ (resp., $j = j_2$),
and either $W_1 \ne W_2$, or $W_1 = W_2$ and $W_1 - (V_{2j_1} \cup V_{2j_2}) \ne \emptyset$ holds.

Proof. Let $J_0 = \{ j \mid \mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 0 \}$ and
$J_1 = \{1, 2, \ldots, i\} - J_0 = \{ j \mid \mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 1 \}$. By
Proposition 1 we see that $G_{i-1} - \bigcup_{j=1}^{i} A_j(\mathcal{P}) = G_{i-1} - \bigcup_{j \in J_1} A_j(\mathcal{P})$. By definition,
$\mathrm{comp}(G_{i-1} - \bigcup_{j=1}^{i} A_j(\mathcal{P})) \ge \mathrm{comp}(G_0 - \bigcup_{j=1}^{i} A_j(\mathcal{P})) \ge 2i + 1$ holds. Thus

$$\mathrm{comp}\Big(G_{i-1} - \bigcup_{j \in J_1} A_j(\mathcal{P})\Big) - \mathrm{comp}(G_{i-1}) = \mathrm{comp}\Big(G_{i-1} - \bigcup_{j=1}^{i} A_j(\mathcal{P})\Big) - \mathrm{comp}(G_{i-1}) \ge (2i+1) - (2i-1) = 2. \qquad (7)$$

(Notice that $\mathrm{comp}(G_{i-1}) = 2i - 1$ since $w > 0$.) Thus by (7) there exist at least
two indices $j'_1, j'_2 \in J_1$ (that is, $|J_1| \ge 2$). By applying Proposition 2 to $j = j'_1$
and $j'_2$, we see that there exist subsets $W'_1, W'_2 \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$ which satisfy
one of the following cases (i'), (ii'), and (iii') (symmetric cases are omitted).

(i') $W'_1$ (resp., $W'_2$) satisfies (a) in Proposition 2 with $j = j'_1$ (resp., $j'_2$),

(ii') $W'_1$ (resp., $W'_2$) satisfies (a) (resp., (b)) in Proposition 2 with $j = j'_1$
(resp., $j'_2$),

(iii') $W'_1$ (resp., $W'_2$) satisfies (b) in Proposition 2 with $j = j'_1$ (resp., $j'_2$).

If (i') or (ii') occurs, then it is clear that $W'_1 \ne W'_2$ and (6) holds for $j'_1$ and
$j'_2$. Thus we can get Proposition 3. Similarly if (iii') occurs with $W'_1 \ne W'_2$, or with
$W'_1 = W'_2$ and $W'_1 - (V_{2j'_1} \cup V_{2j'_2}) \ne \emptyset$, then Proposition 3 still holds.

The only case in which (6) does not hold is (iii') with $W'_1 = W'_2 \subseteq V_{2j'_1} \cup V_{2j'_2}$
(symmetric cases are omitted). In this case, by (7) we see that $J_1 \ne \{j'_1, j'_2\}$. Let
$j'_3 \in J_1 - \{j'_1, j'_2\}$. We show that $j'_1$ and $j'_3$ satisfy (6).

By applying Proposition 2 to $j'_3$, we see that there is a $W'_3 \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$
which satisfies (a) or (b) in Proposition 2 with $j = j'_3$. Since $W'_3 \ne W'_1$ (because
$W'_3 \cap (V_{2j'_3-1} \cup V_{2j'_3}) \ne \emptyset$ and $W'_1 \subseteq V_{2j'_1} \cup V_{2j'_2}$ and $j'_3 \ne j'_1, j'_2$), we see that $j'_1$
and $j'_3$ satisfy (6). Hence Proposition 3 holds. □

We will finish proving Lemma 1 by showing that (1) holds in all the three
cases in Proposition 3. First in case (i), it is clear that the edge set
$E_{G_0}(W_1 \cap V_{2j_1-1}, W_1 \cap V_{2j_1}) \cup E_{G_0}(W_2 \cap V_{2j_2-1}, W_2 \cap V_{2j_2})$ is a possible choice of $F_i$. Thus

$$w(F_i) \le w(E_{G_0}(W_1 \cap V_{2j_1-1}, W_1 \cap V_{2j_1}) \cup E_{G_0}(W_2 \cap V_{2j_2-1}, W_2 \cap V_{2j_2})) \le w(E_{G_0}(V_{2j_1-1}, V_{2j_1}) \cup E_{G_0}(V_{2j_2-1}, V_{2j_2})).$$

Consider a $(2i-1)$-way partition $\mathcal{P}' = \{V_{2j-1}, V_{2j}$ (for all $j \ne j_1, j_2$), $V_{2j_1-1} \cup V_{2j_1}$,
$V_{2j_2-1} \cup V_{2j_2}$, $V_{2i+1}\}$. By the induction hypothesis, we have

$$\sum_{j=1}^{i-1} w(F_j) \le \sum_{j=1}^{i-1} w(A_j(\mathcal{P}'))$$
$$= \sum_{j \ne j_1, j_2} w(A_j(\mathcal{P})) + w\big((A_{j_1}(\mathcal{P}) \cup A_{j_2}(\mathcal{P})) - (E_{G_0}(V_{2j_1-1}, V_{2j_1}) \cup E_{G_0}(V_{2j_2-1}, V_{2j_2}))\big)$$
$$\le \sum_{j=1}^{i} w(A_j(\mathcal{P})) - w(E_{G_0}(V_{2j_1-1}, V_{2j_1}) \cup E_{G_0}(V_{2j_2-1}, V_{2j_2})).$$

Thus (1) (with $k_0 = i$) holds in this case.

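Similarly, in case (ii) we have

$$w(F_i) \le w(E_{G_0}(W_1 \cap V_{2j_1-1}, W_1 \cap V_{2j_1}) \cup E_{G_0}(W_2 \cap V_{2j_2}, W_2 - V_{2j_2})) \le w\big(E_{G_0}(V_{2j_1-1}, V_{2j_1}) \cup E_{G_0}(V_{2j_2}, V - (V_{2j_2-1} \cup V_{2j_2}))\big).$$

(Notice that $W_2 \cap V_{2j_2-1} = \emptyset$.) Apply the induction hypothesis on a $(2i-1)$-way
partition $\mathcal{P}' = \{V_{2j-1}, V_{2j}$ (for all $j \ne j_1, j_2$), $V_{2j_1-1} \cup V_{2j_1}$, $V_{2j_2-1}$, $V_{2j_2} \cup V_{2i+1}\}$.
We have

$$\sum_{j=1}^{i-1} w(F_j) \le \sum_{j=1}^{i-1} w(A_j(\mathcal{P}'))$$
$$= \sum_{j \ne j_1, j_2} w(A_j(\mathcal{P})) + w\big((A_{j_1}(\mathcal{P}) - E_{G_0}(V_{2j_1-1}, V_{2j_1})) \cup (A_{j_2}(\mathcal{P}) - E_{G_0}(V_{2j_2}, V - (V_{2j_2-1} \cup V_{2j_2})))\big)$$
$$\le \sum_{j=1}^{i} w(A_j(\mathcal{P})) - w\big(E_{G_0}(V_{2j_1-1}, V_{2j_1}) \cup E_{G_0}(V_{2j_2}, V - (V_{2j_2-1} \cup V_{2j_2}))\big).$$

Thus (1) (with $k_0 = i$) holds in this case, too.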

Finally in the case (iii), in both cases of $W_1 = W_2$ and $W_1 \ne W_2$, we have

$$w(F_i) \le w(E_{G_0}(W_1 \cap V_{2j_1}, W_1 - V_{2j_1}) \cup E_{G_0}(W_2 \cap V_{2j_2}, W_2 - V_{2j_2}))$$
$$\le w\big(E_{G_0}(V_{2j_1}, V - (V_{2j_1-1} \cup V_{2j_1})) \cup E_{G_0}(V_{2j_2}, V - (V_{2j_2-1} \cup V_{2j_2}))\big)$$
$$\le w(A_{j_1}) + w(A_{j_2}) - w\big(E_{G_0}(V_{2j_1-1}, V - V_{2j_1-1}) \cup E_{G_0}(V_{2j_2-1}, V - V_{2j_2-1})\big).$$

(Notice that $W_1 \cap V_{2j_1-1} = W_2 \cap V_{2j_2-1} = \emptyset$.) Applying the induction hypothesis
on a $(2i-1)$-way partition $\mathcal{P}' = \{V_{2j-1}, V_{2j}$ (for all $j \ne j_1, j_2$), $V_{2j_1-1}$, $V_{2j_2-1}$,
$V_{2j_1} \cup V_{2j_2} \cup V_{2i+1}\}$, we have

$$\sum_{j=1}^{i-1} w(F_j) \le \sum_{j=1}^{i-1} w(A_j(\mathcal{P}')) = \sum_{j \ne j_1, j_2} w(A_j(\mathcal{P})) + w\big(E_{G_0}(V_{2j_1-1}, V - V_{2j_1-1}) \cup E_{G_0}(V_{2j_2-1}, V - V_{2j_2-1})\big).$$

Thus (1) (with $k_0 = i$) holds in this case too.

Hence Lemma 1 has been proved. □



3.3 Proof of Lemma 2

Analogously to the proof of Lemma 1 we also proceed by induction on k. It is
trivial for $k = 2$. Supposing that Lemma 2 holds for $k = 2i - 2$, we consider the
case of $k = 2i$. First consider the case in which there exists a $j \in \{1, 2, \ldots, i\}$
satisfying $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) \ge 2$. If $j > 1$, then the proof is
the same as for Lemma 1. If $j = 1$, we have $w(F_i) \le w(A_1)$ by the optimality of
$F_i$. Consider a $(2i-2)$-way partition $\{V_{2i-2} \cup V_{2i-1}, V_2, V_3, \ldots, V_{2i-3}, V_{2i} \cup V_1\}$.
From the induction hypothesis we can easily get (2). Thus we only need to
consider the case that $\mathrm{comp}(G_{i-1} - A_j(\mathcal{P})) - \mathrm{comp}(G_{i-1}) \le 1$ holds for all
$j = 1, 2, \ldots, i$. In a similar way as in proving Lemma 1 we can show that there
exist $j_1 \ne j_2 \in \{1, 2, \ldots, i\}$ satisfying

$$\mathrm{comp}(G_{i-1} - A_{j_1}(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 1,$$
$$\mathrm{comp}(G_{i-1} - A_{j_2}(\mathcal{P})) - \mathrm{comp}(G_{i-1}) = 1, \qquad (8)$$
$$\mathrm{comp}(G_{i-1} - (A_{j_1}(\mathcal{P}) \cup A_{j_2}(\mathcal{P}))) - \mathrm{comp}(G_{i-1}) \ge 2.$$

Suppose without loss of generality that $j_2 \ne 1$. If $j_1 \ne 1$ then the proof of
Lemma 1 again applies. If $j_1 = 1$, then there exists $W_1 \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$
such that $\emptyset \ne V_1 \cap W_1 \ne W_1$. Also, there exists $W_2 \in \mathcal{V}(F_1 \cup \cdots \cup F_{i-1})$ which
satisfies one of the following cases (i) - (ii).

(i) $W_2 \subseteq V_{2j_2-2} \cup V_{2j_2-1}$ and $W_2 \cap V_{2j_2-2} \ne \emptyset \ne W_2 \cap V_{2j_2-1}$.
(ii) $W_2 \cap V_{2j_2-2} = \emptyset$ and $W_2 \cap V_{2j_2-1} \ne \emptyset \ne W_2 - V_{2j_2-1}$ (or symmetrically,
$W_2 \cap V_{2j_2-1} = \emptyset$ and $W_2 \cap V_{2j_2-2} \ne \emptyset \ne W_2 - V_{2j_2-2}$).

In case (i), we have $w(F_i) \le w(A_1) + w(E_{G_0}(V_{2j_2-2}, V_{2j_2-1}))$. Then it is
easy to show (2) by applying the induction hypothesis to a $(2i-2)$-way partition
$\{V_{2j_2-2} \cup V_{2j_2-1}, V_{2j-2}, V_{2j-1}$ (for $j \ne 1, j_2$), $V_1 \cup V_{2i}\}$.

In case (ii), we have $w(F_i) \le w(A_1 \cup E_{G_0}(V_{2j_2-1}, V - (V_{2j_2-2} \cup V_{2j_2-1})))$.
Then (2) follows from a $(2i-2)$-way partition $\{V_{2j_2-2}, V_{2j-2}, V_{2j-1}$ (for $j \ne 1, j_2$),
$V_1 \cup V_{2j_2-1} \cup V_{2i}\}$. Thus we have proved Lemma 2. □



3.4 Remarks

The correctness of the algorithm [4] (which is described in Sect. 1) relies on the
inequality (2) (or an extended form for $j \ge 4$), as claimed in Lemma 4.3 [4].
We remark that the property of Lemma 2 is no longer valid if we first compute
$F_1, \ldots, F_{k_0-1}$ as min-3-way cuts (e.g., by INC2COMP), and then compute $F_{k_0}$
as a min-2-way cut in EXTG. This is illustrated by the next example. Let $k = 4$.
Consider a graph with vertices $\{a, b, c, d, e\}$ and edges $\{(a,b), (b,c), (c,d), (d,e),
(e,c)\}$. Edge $(b,c)$ has weight 1.5 and others have weight 1. If we compute a
min-3-way cut first, then we have $F_1 = \{(a,b), (b,c)\}$ and $F_2 = \{(c,d), (d,e)\}$.
Consider a 4-way partition $\{\{d\}, \{e\}, \{a\}, \{b,c\}\}$ (thus $A_1 = \{(c,d), (d,e), (e,c)\}$
and $A_2 = \{(a,b)\}$). We have

$$w(F_1) + w(F_2) = 2.5 + 2 > 3 + 1 = w(A_1) + w(A_2).$$

(In fact, the algorithm [4] constructs a k-way cut in this way for $j = 3$ and an
even k. Thus Lemma 4.3 [4] is not valid in this case.)

There is also an example to show that such an extension of Lemma 1 (as introduced
in Lemma 4.3 [4]) is no longer valid. For $j = 4$ and $k = 7$, let $G$ be a graph
with vertices $\{a, b, c, d, e, f, g, h\}$ and edges $\{(a,b), (b,c), (c,d), (d,b), (d,e),
(e,f), (f,g), (g,e), (g,h)\}$, where $(d,e)$ has weight 3 and others have weight 2. In a
way similar to 3-SPLIT (or the algorithm [4]), we first get a min-4-way cut
$F_1 = \{(a,b), (d,e), (g,h)\}$, and then $F_2 = \{(b,c), (c,d), (d,b), (e,f), (e,g)\}$. One may
expect that the next inequality holds for any 7-way partition $\{V_1, V_2, \ldots, V_7\}$:

$$w(F_1) + w(F_2) \le w(A_1) + w(A_2),$$

where $A_1 = E_G(V_1, V_2, V_3, V_4 \cup V_5 \cup V_6 \cup V_7)$, and $A_2 = E_G(V_4, V_5, V_6, V_1 \cup V_2 \cup V_3 \cup V_7)$.
However, consider a 7-way partition $\{\{a\}, \{b\}, \{c\}, \{f\}, \{g\}, \{h\}, \{d, e\}\}$
(thus $A_1 = \{(a,b), (b,c), (b,d), (c,d)\}$ and $A_2 = \{(e,f), (e,g), (f,g), (g,h)\}$). We
see that

$$w(F_1) + w(F_2) = 7 + 10 > 8 + 8 = w(A_1) + w(A_2).$$
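The arithmetic of the $k = 4$ example can be confirmed by exhaustive search on the five-vertex graph. The sketch below is only a check of the numbers quoted above; the helper names (`components`, `min_cut_weight`) are assumptions, and the brute-force search stands in for INC2COMP and EXTG, which are not reproduced here.

```python
from itertools import combinations

# Five-vertex example: edge (b, c) has weight 1.5, all others weight 1.
weights = {
    ("a", "b"): 1.0, ("b", "c"): 1.5, ("c", "d"): 1.0,
    ("d", "e"): 1.0, ("e", "c"): 1.0,
}
vertices = "abcde"


def components(removed):
    """Number of connected components after deleting the edges in `removed`."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v) in weights:
        if (u, v) not in removed:
            parent[find(u)] = find(v)
    return len({find(v) for v in vertices})


def min_cut_weight(num_components, already_removed=frozenset()):
    """Cheapest additional edge set that leaves at least `num_components` components."""
    remaining = [e for e in weights if e not in already_removed]
    best = None
    for size in range(len(remaining) + 1):
        for subset in combinations(remaining, size):
            if components(already_removed | set(subset)) >= num_components:
                total = sum(weights[e] for e in subset)
                if best is None or total < best:
                    best = total
    return best


f1 = frozenset({("a", "b"), ("b", "c")})
w_f1 = min_cut_weight(3)          # min-3-way cut computed first
w_f2 = min_cut_weight(4, f1)      # then a min-2-way cut in what remains
w_a = sum(weights[e] for e in [("c", "d"), ("d", "e"), ("e", "c")]) + weights[("a", "b")]
```

With these values `w_f1 + w_f2` exceeds `w_a`, which is the violation of the inequality described in the remark.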

4 Conclusion 

We have shown that, via repeated applications of minimum 3-way cuts, we can
obtain a k-way cut whose weight is no more than $2 - 3/k$ (resp.,
$2 - (3k-4)/(k^2-k)$) times the optimal for odd k (resp., even k). It is not difficult to
show that the ratios are tight. Can this be improved if we compute a k-way cut
via minimum j-way cuts with $j \ge 4$? It seems that a different approach is needed
for general $j \ge 4$.
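To get a feel for the two ratios, they can be tabulated with exact rational arithmetic. This is a small illustrative sketch; the function name `three_split_ratio` is an assumption made here.

```python
from fractions import Fraction


def three_split_ratio(k):
    """Approximation ratio stated above: 2 - 3/k for odd k, 2 - (3k-4)/(k^2-k) for even k."""
    if k % 2 == 1:
        return 2 - Fraction(3, k)
    return 2 - Fraction(3 * k - 4, k * k - k)


table = {k: three_split_ratio(k) for k in range(2, 11)}
```

For $k = 2$ and $k = 3$ the ratio evaluates to 1, as expected since a single minimum 2-way or 3-way cut is already optimal, and the ratio stays below 2 for every k.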
Abstract. We consider the online competitiveness for scheduling a set
of communication jobs (best described in terms of a weighted graph
where nodes denote the communication agents and edges denote
communication jobs and three weights associated with each edge denote its
length, release time, and deadline, respectively), where each node can
only send or receive one message at a time. A job is accepted if it is
scheduled without interruption in the time interval corresponding to its
length between release time and deadline. We want to maximize the sum
of the lengths of the accepted jobs. When an algorithm is not able to
preempt (i.e., abort) jobs in service in order to make room for better
jobs, a previous lower bound shows that no algorithm can guarantee any
constant competitive ratio. We examine a natural variant in which jobs
can be aborted and each aborted job can be rescheduled from the start
(called restart). We present simple algorithms under the following
assumptions on job length: a 2-competitive algorithm for unit jobs under
the discrete model of time and a $(6 + 4\sqrt{2} \approx 11.656)$-competitive
algorithm for jobs of arbitrary length. These upper bounds are complemented
by the lower bounds 1.5 and $8 - \varepsilon$, respectively.



1 Introduction 

In parallel and distributed environments the following scheduling problem arises:
We have a set of communication jobs, best described in terms of a weighted
graph. Each node denotes a communication agent, each edge denotes a
communication job, and the weight of an edge denotes the time required for transmission.
It is implied that each node can only send or receive one message at a time, but 
messages between different nodes can overlap; thus, a simultaneous transmission 
of messages between several node pairs is a matching of the graph. One special 
case of this scheduling problem is modeled as a bipartite graph. In this case, 
one node set denotes the senders, the other set the receivers, and a simultaneous 
transmission of messages is a bipartite matching. This model arises in many 
applications of switching systems including SS/TDMA network [bl4] and I/O 
platform (where one side of a bipartite graph is the set of disks and the other 
side is the set of I/O processors) for scheduling I/O requests [9,8,10].

* Supported in part by KOSEF grant 98-0102-07-01-3. 
One interesting variant here is the one in which jobs have individual release
times and deadlines. The problem is online as we assume that an algorithm has 
no knowledge about the existence of any job until the release time. A job is 
accepted if it is scheduled without interruption in the time interval between its 
release time and its deadline. We want to maximize the sum of the length of the 
accepted jobs. Though jobs, to be accepted, must be scheduled without
interruption, jobs can be preempted (i.e., aborted) to make room for better jobs.
Aborted jobs can be rescheduled from the start as if they had never been scheduled (i.e.,
restart in m)- Note that aborts and restarts are meaningless in off-line sche- 
duling j1 2J . (More precisely, in traditional off-line scheduling, preemptive jobs 
can be executed piece by piece and non-preemptive jobs are executed without 
interruption. In online scheduling, there is an in-between one: jobs can abort 
but we count only the executions without interruption of jobs.) However, in our 
problem, the introduction of abort and restart is quite natural, because without 
abort, no algorithm can guarantee any constant competitive ratio. We call this 
problem ONLINE REALTIME SCHEDULING, defined formally as follows: 

Given a graph $G = (V, E)$, each job $J_j$ that occurs along an edge $e_j$ is a
triple of non-negative real numbers $(r_j, p_j, d_j)$, where $r_j$ is the release time
or arrival time of the job, $p_j$ is the length of the processing time and $d_j$ is the
deadline. We refer to the expiration time, $x_j = d_j - p_j$, as the latest time at
which a job can be started while still meeting its deadline. We require that
two jobs adjacent to the same node cannot be scheduled simultaneously.
The gain of a schedule $\sigma$, which we denote as $\|\sigma\|$, is the sum of the
lengths of the jobs accepted by $\sigma$. A problem instance is given in Figure 1,
where each job $J_j$ is denoted by a triple $(r_j, p_j, d_j)$ associated with an edge.




Fig. 1. An instance of the ONLINE REALTIME SCHEDULING problem. 



One special case of ONLINE REALTIME SCHEDULING is obtained by
adopting the discrete model of time. In this case, each job is represented as a
triple of non-negative integers and jobs are scheduled at discrete times. Such
a problem naturally arises in SS/TDMA switching and synchronous I/O
scheduling. In particular, packet switching in ATM networks involves the discrete
time model and messages of unit length, which is discussed in Section 2.

We will measure the performance of an online scheduling algorithm by
comparing the gain of the algorithm to the gain of the optimal off-line algorithm
Opt that knows the entire input in advance. We will say that an online
algorithm $A$ is c-competitive if $\mathrm{gain}_{Opt}(I) \le c \cdot \mathrm{gain}_A(I)$ for all input instances
$I$. It is easily seen that a natural off-line version of ONLINE REALTIME
SCHEDULING is NP-hard in its most restricted form (the proof is omitted in this
abstract). Thus any c-competitive algorithm is also a c-approximation algorithm
for an NP-hard problem.



1.1 Our Results 

We first present a simple greedy algorithm for unit jobs under the discrete model
of time and show that it is 2-competitive. Next, we consider jobs of arbitrary
length and present a $(6 + 4\sqrt{2} \approx 11.656)$-competitive algorithm. Our algorithm
repeatedly computes a maximum matching with small change, which balances
two opposing goals: maximum matching and minimization of aborts of scheduled
jobs. This upper bound is complemented by the lower bound $8 - \varepsilon$, which is obtained
by extending the lower bound 4 in the interval scheduling problem [13].



1.2 Related Work 

There has been much research on the problem of scheduling communication
jobs without deadlines. In general, this work adopted the discrete model of time and
aimed to minimize the total time required for the completion of all communication
jobs. Gopal and Wong [6] formulated this problem, inspired by SS/TDMA
switches, and gave a heuristic algorithm to minimize the total switching time
in SS/TDMA switches. Recently, Crescenzi, et al. [Ij presented 2-approximation 
algorithms to find a preemptive schedule that minimizes the sum of switching 
times and the number of switchings and showed the lower bound of the approxi- 
mation ratio g. Jain, et al. mm studied many variants of this problem raised 
in I/O platform. However, in these works, neither deadline constraints nor online 
algorithms were considered. 

Our problem is an extension of the interval scheduling problem, which was
introduced by Woeginger [13]. In the interval scheduling problem, the entire system
includes only two communication nodes (thus, every communication job occurs
between the two nodes and at most one communication job can be scheduled at
a time) and the expiration time $x_j$ of a job $J_j$ equals the release time $r_j$ (usually
termed a job with no slack time $x_j - r_j$). The second restriction implies
that each job is either scheduled at its release time or never scheduled. Thus
our problem extends the interval scheduling problem in two
directions: arbitrary slack time and a conflict constraint in that jobs adjacent
with the same node cannot be scheduled simultaneously. As observed in |^, 






J.-H. Lee and K.-Y. Chwa 



the existence of slack acts as a double-edged sword for competitive analysis. This is
because the added flexibility helps not only an online algorithm but also Opt.

Other research on the interval scheduling problem assumes that once a
job begins running, the job cannot be aborted (see [11,5]). Recently, variants of
online scheduling of realtime communications dealt with unit jobs, focusing on 
the congestion in linear network, trees, and meshes | 2I1| and selection of requests 
among alternatives [2]. 

2 Unit Jobs Under the Discrete Time Model 

In this section, we assume the discrete time model and consider the jobs of unit 
length. Naturally, our goal is to maximize the number of jobs that are scheduled 
between their release times and deadlines. The algorithm given here is very 
simple - it computes a maximum matching in each time slot - and achieves a 
competitive ratio of 2. 



Shelf-based-Max-Matching (SMM)

    A ← ∅;  { A denotes the set of available jobs, not yet scheduled. }
    for k = 1 to ∞
        Insert into A newly-arrived jobs;
        Compute a maximum matching M for jobs in A;
        Remove the jobs in M from A;
        Remove the jobs whose deadline is k from A;
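The loop above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Job` record and function names are assumptions, the time horizon is cut off at the largest deadline, and the maximum matching is found by exhaustive search, which is only practical for tiny instances (a polynomial-time matching algorithm would replace it in practice).

```python
from collections import namedtuple

# A unit job on edge (u, v) that may occupy any slot k with release <= k <= deadline.
Job = namedtuple("Job", "name u v release deadline")


def max_matching(jobs):
    """Largest set of jobs with pairwise disjoint endpoints (brute force)."""
    best = []

    def extend(idx, chosen, used):
        nonlocal best
        # Prune branches that cannot strictly beat the best matching found so far.
        if len(chosen) + len(jobs) - idx <= len(best):
            return
        if idx == len(jobs):
            best = list(chosen)
            return
        job = jobs[idx]
        if job.u not in used and job.v not in used:
            extend(idx + 1, chosen + [job], used | {job.u, job.v})
        extend(idx + 1, chosen, used)

    extend(0, [], frozenset())
    return best


def smm(jobs):
    """Run SMM slot by slot; return {slot: [names of jobs scheduled in that slot]}."""
    horizon = max(job.deadline for job in jobs)
    available, schedule = [], {}
    for k in range(1, horizon + 1):
        available += [job for job in jobs if job.release == k]
        matching = max_matching(available)
        schedule[k] = [job.name for job in matching]
        # Drop scheduled jobs and jobs whose deadline is the current slot.
        available = [
            job for job in available
            if job not in matching and job.deadline > k
        ]
    return schedule


# Instance used in the lower-bound argument for unit jobs.
instance = [
    Job("J1", "v", "w", 1, 2),
    Job("J2", "v", "w2", 1, 2),
    Job("J3", "v2", "w2", 2, 2),
]
result = smm(instance)
```

On this instance the sketch schedules two of the three jobs, one per slot.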



Theorem 1. The algorithm SMM is 2-competitive.

Proof. Let $D = \max_j d_j$. For $i, j$ such that $1 \le i \le j \le D$, let $SMM(i,j)$ denote
the set of jobs scheduled by SMM in time interval $[i,j]$. Analogously, let $Opt(i,j)$ denote
the set of jobs scheduled by Opt. For a set of jobs, $|\cdot|$ denotes its cardinality.
Since $SMM(i,i)$ is a maximum matching, its cardinality is best possible if we
ignore the decisions in the previous steps. Thus, the following is immediate:

$$|SMM(i,i)| \ge |Opt(i,i) - SMM(1,i-1)|.$$

Therefore, by summing both sides,

$$|SMM(1,D)| = \sum_{i=1}^{D} |SMM(i,i)|$$
$$\ge \sum_{i=1}^{D} |Opt(i,i) - SMM(1,i-1)|$$
$$= \sum_{i=1}^{D} |Opt(i,i)| - \sum_{i=1}^{D} |Opt(i,i) \cap SMM(1,i-1)|$$
$$\ge \sum_{i=1}^{D} |Opt(i,i)| - |SMM(1,D)|.$$

The last inequality needs some explanation: For different $i$'s, the sets
$Opt(i,i) \cap SMM(1,i-1)$ are mutually disjoint, and thus their union is a subset of $SMM(1,D)$. Then
the cardinality of this union, $\sum_{i=1}^{D} |Opt(i,i) \cap SMM(1,i-1)|$, is no larger than
that of $SMM(1,D)$, which proves the last inequality. Therefore, we have

$$2 \cdot |SMM(1,D)| \ge |Opt(1,D)|,$$

completing the proof. □

Theorem 2. No online algorithm for unit jobs is better than 1.5-competitive.

Proof. Suppose that a job $J_1 = (1,1,2)$ arises along an edge $(v,w)$ and a job
$J_2 = (1,1,2)$ arises along an edge $(v,w')$ initially. Any online algorithm can
schedule only one of $J_1$ and $J_2$ at time 1. Without loss of generality, assume
that $J_1$ was scheduled at time 1. Then, the adversary gives the next job $J_3 = (2,1,2)$
along an edge $(v',w')$. At time 2, any online algorithm can schedule only one of
$J_2$ and $J_3$. In total, at most two jobs are scheduled. However, for the same jobs,
Opt can schedule all of them, which implies that no online algorithm can be
better than 1.5-competitive. □
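The claim that an off-line schedule accepts all three jobs of this instance can be verified by enumerating every slot assignment. The sketch assumes the slot convention suggested by SMM (a unit job with triple $(r, 1, d)$ may occupy any slot $k$ with $r \le k \le d$); the function name `offline_optimum` is introduced here for illustration.

```python
from itertools import product


def offline_optimum(jobs):
    """Maximum number of unit jobs that can be scheduled without node conflicts.

    jobs: list of (u, v, release, deadline); each job takes one slot in
    [release, deadline] or is rejected (slot None).
    """
    choices = [[None] + list(range(r, d + 1)) for (_, _, r, d) in jobs]
    best = 0
    for slots in product(*choices):
        busy = set()          # (node, slot) pairs already in use
        feasible = True
        for (u, v, _, _), slot in zip(jobs, slots):
            if slot is None:
                continue
            if (u, slot) in busy or (v, slot) in busy:
                feasible = False
                break
            busy.update({(u, slot), (v, slot)})
        if feasible:
            best = max(best, sum(slot is not None for slot in slots))
    return best


# J1 on (v, w), J2 on (v, w'), J3 on (v', w'), written here with w2, v2 for w', v'.
adversary_jobs = [("v", "w", 1, 2), ("v", "w2", 1, 2), ("v2", "w2", 2, 2)]
opt_value = offline_optimum(adversary_jobs)
```

The optimum of three corresponds to running $J_2$ in slot 1 and $J_1$, $J_3$ together in slot 2, against at most two jobs for an online algorithm, which gives the ratio 3/2.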



3 Jobs of Arbitrary Length 

This section considers jobs of arbitrary length and adopts the continuous model
of time. We present an 11.656-competitive algorithm and show that no algorithm can
be better than $(8 - \varepsilon)$-competitive. We let $r(J_j), p(J_j), d(J_j)$ denote $r_j, p_j, d_j$.

First we discuss the upper bound. The main challenge in this case, compared
with the unit jobs, is that some running job should be aborted to make room
for longer jobs. For example, suppose a job $J_i$ with $p(J_i) = 2$ is scheduled at $t$
and a job $J_j$ with $p(J_j) = k$ and $d(J_j) = t + k + 1$ arrives at $t + 1$ and conflicts
with $J_i$. In order to be better than $\frac{k}{2}$-competitive, any strategy would abort $J_i$
and schedule $J_j$.

The algorithm given here is similar to that of the previous section: it
computes a maximum weighted matching whenever a new job arrives or some running
job completes its execution. The main modification is to abort running jobs that are
in conflict with 'sufficiently long' jobs. We formally define the 'long' jobs: Fix a
constant $C$ $(> 1)$. Let $M$ be a matching at a certain time. If a job $J_e$ is not in $M$,
is in conflict with jobs $J_i, J_j$ in $M$ and satisfies $p(J_e) > C \cdot (p(J_i) + p(J_j))$, then
we call $J_e$ an evicter of $J_i, J_j$ for $M$. (Of course, in case that $J_e$ is in conflict with
only one job, say $J_i$, and $p(J_e) > C \cdot p(J_i)$, $J_e$ is an evicter of $J_i$.) The matching
we want to find is not simply a maximum matching but a special matching $M$
such that no evicter exists for $M$. To implement this matching, we first define a
new weight $w$ of jobs as follows: For each running job $J$ scheduled earlier, let
$w(J) = C \cdot p(J)$. And, for other jobs available, let $w(J) = p(J)$. Next, we
compute a maximum weight matching with the jobs in $M$ and $A$ and weights $w$.
This modified weight $w$ is used only in this routine.

It is clear that an edge can conflict with at most two edges of a matching.
General-Shelf-based-Max-Matching (GSMM(C))

    initialize A = ∅; F = ∅; M = ∅;
    for (each job J that is newly arrived or completed its execution) do
        Insert into A newly-arrived jobs;
        { Let M be the current matching }
        Remove from M completed jobs;
        For each job J in M, let w(J) = C · p(J);
        For each job J in A, let w(J) = p(J);
        Compute a new maximum matching M for jobs in M and A, using w;
        Remove the jobs in M from A;
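One rematching step of GSMM(C) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: jobs are tuples `(name, u, v, length)`, the function names are hypothetical, deadlines and expiration times are ignored, and the maximum weight matching is found by exhaustive search suitable only for very small inputs.

```python
import math


def max_weight_matching(jobs, weight):
    """Heaviest set of jobs with pairwise disjoint endpoints (brute force)."""
    best, best_weight = [], 0.0

    def extend(idx, chosen, used, total):
        nonlocal best, best_weight
        if idx == len(jobs):
            if total > best_weight:
                best, best_weight = list(chosen), total
            return
        job = jobs[idx]
        name, u, v, _ = job
        if u not in used and v not in used:
            extend(idx + 1, chosen + [job], used | {u, v}, total + weight[name])
        extend(idx + 1, chosen, used, total)

    extend(0, [], frozenset(), 0.0)
    return best


def gsmm_step(running, available, C):
    """Recompute the matching; running jobs get weight C*p, available jobs weight p."""
    weight = {name: C * length for (name, _, _, length) in running}
    weight.update({name: length for (name, _, _, length) in available})
    matching = max_weight_matching(running + available, weight)
    chosen = {job[0] for job in matching}
    aborted = [job for job in running if job[0] not in chosen]
    still_available = [job for job in available if job[0] not in chosen]
    return matching, aborted, still_available


C = 1 + math.sqrt(2)
running = [("Ji", "a", "b", 2)]

# A conflicting job of length 10 exceeds C * 2, so it evicts Ji.
long_case = gsmm_step(running, [("Jj", "b", "c", 10)], C)
# A conflicting job of length 4 is shorter than C * 2, so Ji keeps running.
short_case = gsmm_step(running, [("Jj", "b", "c", 4)], C)
```

Inflating the weight of running jobs by the factor $C$ is what makes the matching "sticky": a newcomer displaces running jobs only when it is more than $C$ times as long as the jobs it conflicts with.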



A job is said to be accepted if it is successfully completed; otherwise, it is rejected.
Note that some rejected jobs can be scheduled and aborted several times. For a
set $S$ of jobs, we let $p(S)$ denote the sum of the lengths of the jobs in $S$.

Theorem 3. The algorithm GSMM(C) is $\left(\frac{C+3}{C-1} + 2C + 3\right)$-competitive.

Proof. Let $S$ be the set of jobs accepted by GSMM(C), and let $R$ be the set
of jobs accepted by Opt but rejected by GSMM(C). We would like to bound
$p(R)$ with respect to $p(S)$. Let $R_1 \subseteq R$ be the set of jobs that are (partially)
scheduled at least once by GSMM(C) and $R_2 \subseteq R$ be the set of jobs that are
never scheduled by GSMM(C).

We first bound $p(R_1)$. Suppose that GSMM(C) schedules a job $e \in R_1$ and
later aborts it at $t$. We fix attention on the iteration of the for loop in GSMM(C)
at $t$. Then we claim the following:

Claim. Let $R_t$ be the set of jobs aborted at $t$ and $N(R_t)$ be the set of jobs that
are scheduled at $t$ and in conflict with some jobs in $R_t$. Then, we have that

• $p(N(R')) \ge C \cdot p(R')$ for every subset $R' \subseteq R_t$.

• it is possible to divide each job $e \in R_t$ into two fragments $e_1, e_2$ and each
job $f \in N(R_t)$ into two fragments $f_1, f_2$ so that (i) $p(e_1) + p(e_2) = p(e)$,
(ii) $p(f_1) + p(f_2) = p(f)$, and (iii) each fragment $e'$ of any job $e \in R_t$
corresponds one-to-one to a fragment $f'$ of some job $f \in N(R_t)$ and satisfies
$p(f') \ge C \cdot p(e')$. We call $f'$ the evicter of $e'$.

Proof of Claim: Omitted.

Each job may be scheduled and aborted several times and we consider all aborts
of each job. We assume $R_1$ is a multiset whose elements are (schedulings of) jobs
in $R$. By the claim we can divide each job in $R_1$ into two fragments so
that each fragment of an aborted job corresponds one-to-one to its evicter, which
is at least $C$ times longer. It should be mentioned that the specific scheduling time of
each fragment is unimportant since we are only interested in the length bound.

Now we define an abort-diagram $D$, represented by a rooted tree whose node
set consists of the jobs in $S$ (scheduled by GSMM(C)) and the fragments of the
jobs in $R_1$. The root of $D$ is a job $e$ in $S$. If $e$ consists of two fragments (by
the procedure of the Claim), we assume the root is their union. We construct $D$
by repeatedly applying the following rule. (During the construction, we further
divide the fragments, if necessary.) For each fragment $f$ of a job in $R_1$, if some
x-portion $e'$ of its evicter is in $D$, then we further divide $f$ into its x-portion and
remainder and add the x-portion of $f$ to $D$ as a child of $e'$. (Note that $e'$ is at
least $C$ times longer than the x-portion of $f$.) It is easily seen that the diagram
$D$ is a tree. See Figure 2.




In the same fashion, we can construct all abort-diagrams for each job in $S$.
Each job in $R_1$ is divided into a number of fragments, each of which is pushed
into some diagram.

For the analysis, let us fix attention on a specific abort-diagram $D$. Suppose that
$e$ is the root of $D$. The sum of the lengths of two children is at most $1/C$ of that
of their parent. Therefore, the sum of the lengths of all descendants is no larger
than $\frac{1}{C-1}$ times $p(e)$. Summing this bound over all abort-diagrams
leads to

$$p(R_1) \le \frac{1}{C-1} \cdot p(S).$$
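The factor in this bound comes from a geometric series. The following worked step is a reconstruction from the surrounding argument (each level of an abort-diagram is at most $1/C$ as long as the level above it), with $L_j$ introduced here as a name for the total length of the nodes at depth $j$:

```latex
% L_0 = p(e) is the root; children are at most 1/C as long as their parents,
% so L_j <= L_{j-1} / C <= p(e) / C^j for every depth j >= 1.
\[
  \sum_{j \ge 1} L_j \;\le\; p(e) \sum_{j \ge 1} \frac{1}{C^{j}}
  \;=\; p(e) \cdot \frac{1/C}{1 - 1/C}
  \;=\; \frac{p(e)}{C - 1}, \qquad C > 1 .
\]
% Summing over all roots e in S gives p(R_1) <= p(S) / (C - 1).
```

The series converges only because $C > 1$, which is why the constant is required to exceed 1 in the definition of an evicter.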

Next, we consider $R_2$, the set of jobs in $R$ that are never scheduled. Recall
that jobs in $R$ are scheduled by Opt. Consider a job $e \in R_2$ and suppose that Opt
schedules $e$ in time interval $[t, t + p(e)]$. Since GSMM(C) never schedules $e$, it must
be that GSMM(C) schedules other jobs, say $e'$ and $e''$, before $t$ that are in conflict
with $e$ and keeps them against $e$ at $t$. In other words, $w(e') + w(e'') \ge w(e)$ at $t$.

By taking $\alpha = \frac{p(e')}{p(e') + p(e'')}$ and $\beta = 1 - \alpha$, we call $e'$ the $\alpha$-keeper of $e$ and $e''$ the
$\beta$-keeper of $e$. In this case, we conceptually divide the job $e$ into two fragments,
its $\alpha$ portion and $\beta$ portion. Observe that if a job is the x-keeper of $e$, then
its length is no less than $x \cdot p(e)/C$. Indeed otherwise, $p(e') + p(e'') < p(e)/C$. Then, from
the definition of $w$, $w(e') + w(e'') < w(e)$, which is a contradiction.

In this way, we divide each job in $R_2$ into two fragments. Then each fragment
in $R_2$ has its unique keeper. If a job $e$ is a keeper, then it must be scheduled
(possibly aborted later). Thus, each keeper is in $S$ or $R_1$. First suppose that a
keeper $e$ is in $S$. Let $K(e)$ be the set of fragments in $R_2$ whose keeper is $e$. Since
Opt can schedule at most two jobs simultaneously that conflict with $e$ and the
length of each of them is at most $C$ times $p(e)$, the sum of the lengths of these
fragments in $K(e)$ is at most $(2C + 2)$ times $p(e)$.

Next, suppose that a keeper $e$ is in $R_1$. Let $K'(e)$ be the set of fragments in
$R_2$ whose keeper is $e$. Though this case is similar to that of $K(e)$, since $e$ is aborted by
some $e'$, the sum of the lengths of these fragments in $K'(e)$ is at most $(C + 2)$ times
$p(e)$. (More precisely, let $e = (i,j)$ and $e' = $ […]. Jobs in $K(e)$ are of the form
$(i, *)$ or […]. The latter jobs can conflict with the jobs in $K'(e')$. Therefore, the
sum of the lengths of the jobs in $R_2$ whose keeper is in $R_1$ is at most $(C + 2)$ times
the length of their keeper.) Thus we have that

$$p(R_2) \le (2C + 2) \cdot p(S) + (C + 2) \cdot p(R_1).$$



Therefore, 



p{Ri) + p(i?2) = [{C + 3) • p{Ri) + (2C + 2) • p{S) 



— [q + 2C + 2] • p{S) 

Formally, let Opt{R) (resp, Opt{S)) denote the set of the jobs in R (resp, S) 
that are scheduled by Opt. 

p{Opt{R))<[^^ + 2C + 2]-p{S) 

Moreover, since all jobs in S are somehow scheduled by GSMM(C), we have that 

$$p(Opt(S)) \le p(S).$$



Therefore, 



$$\|Opt\| \le \left[\frac{C+3}{C-1} + 2C + 3\right] \cdot \|GSMM(C)\|,$$

completing the proof. □ 

By optimizing C, the competitive ratio of the preceding theorem becomes 6 + 4√2 (≈ 11.656),
when C = 1 + √2.
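
The optimization over C can be checked numerically. The sketch below assumes that the bracketed factor in the bound is (C+3)/(C−1) + 2C + 3, which is the form consistent with the stated optimum; the function names are illustrative and not part of the original analysis.

```python
import math


def ratio_bound(c: float) -> float:
    """Assumed bound (C+3)/(C-1) + 2C + 3 on ||Opt|| / ||GSMM(C)||, for C > 1."""
    return (c + 3.0) / (c - 1.0) + 2.0 * c + 3.0


def minimize_on_grid(lo: float, hi: float, steps: int) -> float:
    """Return the grid point in [lo, hi] with the smallest bound (coarse search)."""
    best = lo
    for i in range(steps + 1):
        c = lo + (hi - lo) * i / steps
        if ratio_bound(c) < ratio_bound(best):
            best = c
    return best


# Setting the derivative -4/(C-1)^2 + 2 to zero gives C = 1 + sqrt(2).
c_star = 1.0 + math.sqrt(2.0)
closed_form = 6.0 + 4.0 * math.sqrt(2.0)
c_grid = minimize_on_grid(1.01, 10.0, 100_000)
print(c_star, ratio_bound(c_star), closed_form, c_grid)
```

The grid search is only a sanity check; the closed form follows from the derivative noted in the comment.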

Theorem 4. The competitive ratio of the ONLINE REALTIME SCHEDULING
problem is at most 6 + 4√2 (≈ 11.656).

We now turn to the lower bound. 

Theorem 5. In the ONLINE REALTIME SCHEDULING problem, the competitive
ratio of any algorithm is at least 8 − ε.
Proof. Our proof is an extension of Woeginger's proof in [13], which established
a lower bound of 4 − ε for the interval scheduling problem. We assume that all
jobs have no slack time, as in [13]. Recall that in the interval scheduling problem,
the entire system includes only two communication nodes; thus, every communication
job occurs between these two nodes and at most one communication job
can be scheduled at any instant. The proof for the interval scheduling problem
has the following form: The adversary generates the intervals $J_1, J_2, \ldots, J_n$. If
an online algorithm accepts intervals $J'_1, J'_2, \ldots, J'_{n'}$, then Opt can accept
intervals $J''_1, J''_2, \ldots, J''_{n''}$ such that $\sum_{i=1}^{n''} p(J''_i) \ge (4 - \epsilon) \sum_{i=1}^{n'} p(J'_i)$, where ε is an
arbitrary constant.

In the sequel, we modify the proof of [13] to obtain the lower bound of 8 − ε.
First, we consider a very simple bipartite graph with node sets {1,2} on both
sides, instead of two nodes in the interval scheduling. Next, we simulate each
interval $J_i$ with three jobs, called a job set.

Definition of job sets: For the interval Ji with the release time r{Ji) and 
deadline d{Ji) and length p{Ji) (recall that in the interval scheduling problem, 
p(Ji) = d{Ji) — r{Ji)), the adversary generates three jobs Ki = (T^p, Ti_ 2 , 
as a unit; let Si = ^, where S is sufficiently small constant. 

— The time interval [r( Jj j), d{Tij)] of the job Tij G Ki fulfills 

r(T,,,) = r{J,) + ^-^S, {j = 2,3), = d(T,, 2 ) = d{T,,s) = d{Ji), 

p(Tij) = d(r,j) -r(Tij) 

— The node pairs between which the jobs in Ki occur depend on the behavior 
of an online algorithm H. Suppose that the currently scheduled job is located
at (1,1) and the current time is $r(T_{i,1})$. Then, job $T_{i,1}$ is given at (1,1) and
$T_{i,2}$ is given to (1,2) (at $r(T_{i,2})$). If the algorithm H schedules the job $T_{i,2}$,
then $T_{i,3}$ is given to (2,2). Otherwise $T_{i,3}$ is given to (2,1). Note that any
online algorithm H can execute at most one job at any instant, whereas
the off-line algorithm can schedule two jobs.

— Some properties from the above: If a job in Ki is of length v, then any other 
job in Ki is at least v — |(5i; and sum of the length of two jobs in Ki is at 
least 2v — Si. 

The remaining proof is straightforward: If any online algorithm schedules jobs 
in job sets K[,K' 2 , - ■ ■ , K '^, , then its gain is at most 1 ) ■ H we consider 

the corresponding interval scheduling problem, some online algorithm schedules 
the intervals J[, J 2 , ■ ■ ■ , J'^' and Opt can schedule some intervals Jf, J 2 , • • • , J"/ 

that satisfies Y^^=iP{J'i) ^ (4 — f^i)Y^\=iP{J'i)- l^O'^ ®ach interval J", Opt can 
schedule at least two jobs in each job set iF", whose sum is at least 2p{J[') — Si. 
Thus the gain of Opt is at least (8 — 2ei) Y^\=iP{'^'i) ~ 2^- By letting e\ = |, we 
showed that no online algorithm can be better than (8 — e)-competitive. □ 



Acknowledgement. We are grateful to Oh-Heum Kwon for motivating this work.

Abstract. Given a set of sources and a set of sinks in the two-dimensional
grid of size n, the disjoint paths (DP) problem is to connect every
source to a distinct sink by a set of edge-disjoint paths. Let v be the
total number of sources and sinks. In [3], Chan and Chin showed that
without loss of generality we can assume $v \le n \le 4v^2$. They also showed
how to compress the grid optimally to a dynamic network (the structure
of the network may change depending on the paths found currently) of
size $O(\sqrt{nv})$, and solve the problem in $O(\sqrt{n}\,v^{3/2})$ time using the augmenting
path method in maximum flow. In this paper, we improve the time
complexity of solving the DP problem to $O(n^{3/4} v^{3/4})$. The factor of improvement
is as large as $\sqrt{v}$ when n is $\Theta(v)$, while it is at least $v^{1/4}$ when n
is $\Theta(v^2)$.



1 Introduction 

Given a set of sources S and a set of sinks T in the two-dimensional grid, the
disjoint paths (DP) problem is to connect every source in S to a distinct sink
in T (assuming $|S| \le |T|$) by a set of edge-disjoint paths. Note that a source
can be connected to any sink. Suppose $|S| + |T| = v$. It has been shown in [3]
that the size of the grid is bounded by n for $v \le n \le 4v^2$ after an $O(v)$ time
preprocessing.

In practice, the DP problem is a generalization of the breakout routing problem
[H] in printed circuit boards and the single-layer routing problem for pin/ball
grid array packages mms!. When T is the set of all boundary vertices of the
grid, this problem is known as the escape problem [1] or the reconfiguration problem
[IMIKKIllIllTni on VLSI/WSI processor arrays in the presence of faulty
processors.

To find the set of edge-disjoint paths, we usually reduce the DP problem 
to the maximum flow problem in a unit capacity network. The network can be 
constructed by adding to the grid a super-source s and a super-sink t which 
connect to all the sources in S and all the sinks in T, respectively. In [3], Chan
and Chin observe the sparseness of the v sources and sinks in the grid of size n, 
and then introduce a “compression” technique which compresses the network of 

* The research is partially supported by a Hong Kong RGC grant 338/065/0022. 



A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 393-[40^ 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






W.-T. Chan, F.Y.L. Chin, and H.-F. Ting 



size $O(n)$ to a dynamic network of size $O(\sqrt{nv})$. The compression is initiated
by deleting a set of disjoint and empty rectangles (i.e., source and sink free).
(See Figure 1.) Each p x q empty rectangle in the set is then replaced with a
reachability graph of size $O(p + q)$ to restore the connectivity of the network.
Basically, in the resulting network the algorithm finds a maximum flow which
gives a solution to the DP problem. The augmenting paths method [6] is used
as the framework of the algorithm. In each iteration, an augmenting path (an
s-t path in the residual network) is found and the affected reachability graphs
are updated. (See Figure 2.) The algorithm finds the maximum flow in $O(v)$
iterations. The running time of the algorithm can be divided into two categories.
The first category is the time to update the reachability graphs. In an iteration,
all the reachability graphs may need to be updated in the worst case. Since the
total number of vertices and edges in the reachability graphs is bounded by the
network size, which is $O(\sqrt{nv})$, the total update time is $O(\sqrt{n}\,v^{3/2})$ in the $O(v)$
iterations. The second category is the time to find augmenting paths. Since the
upper bound of time to find an augmenting path is the size of the network, it
takes $O(\sqrt{n}\,v^{3/2})$ time to find the $O(v)$ augmenting paths. Therefore, the running
time of the augmenting paths method for the DP problem is $O(\sqrt{n}\,v^{3/2})$† [3].
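
The reduction to unit-capacity maximum flow can be illustrated on the uncompressed grid. The following is a minimal sketch of the plain augmenting paths method with a super-source and super-sink; it does not implement the compression or the reachability graphs, and the function name and the vertex encoding as (row, column) pairs are assumptions made for illustration.

```python
from collections import deque


def max_edge_disjoint_paths(rows, cols, sources, sinks):
    """Count edge-disjoint source-to-sink paths in a rows x cols grid."""
    s, t = "s", "t"   # hypothetical labels for the super-source and super-sink
    cap = {}          # residual capacity of each directed arc
    adj = {}          # neighbours reachable through some residual arc

    def add_arc(u, w, c_uw, c_wu):
        cap[(u, w)] = cap.get((u, w), 0) + c_uw
        cap[(w, u)] = cap.get((w, u), 0) + c_wu
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)

    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                add_arc((r, c), (r + 1, c), 1, 1)   # undirected unit grid edge
            if c + 1 < cols:
                add_arc((r, c), (r, c + 1), 1, 1)
    for src in sources:
        add_arc(s, src, 1, 0)
    for snk in sinks:
        add_arc(snk, t, 1, 0)

    flow = 0
    while True:
        # Breadth-first search for an s-t path in the residual network.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for w in adj.get(u, ()):
                if w not in parent and cap[(u, w)] > 0:
                    parent[w] = u
                    queue.append(w)
        if t not in parent:
            return flow
        # Augment one unit of flow along the path that was found.
        w = t
        while parent[w] is not None:
            u = parent[w]
            cap[(u, w)] -= 1
            cap[(w, u)] += 1
            w = u
        flow += 1
```

A DP instance is solvable exactly when the returned value equals the number of sources. Each search here costs time proportional to the grid size n, which is the cost that the compression to a network of size $O(\sqrt{nv})$ is meant to avoid.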

In this paper, we improve the time for both the update of reachability graphs
and the search for augmenting paths. First, we observe that if the flow is augmented
along the shortest s-t paths, we can bound the total number of updates for
the reachability graphs by $O(\sqrt{nv} \log v)$. Also, we need to limit the size of each
individual reachability graph. For this reason, we give a new procedure which isolates
a set of disjoint and empty rectangles, each with size at most $\sqrt{n/v} \times \sqrt{n/v}$.
Thus each reachability graph has size $O(\sqrt{n/v})$, and the total update time for
the reachability graphs is reduced from $O(\sqrt{n}\,v^{3/2})$ to $O(\sqrt{n/v}\,\sqrt{nv} \log v)$, i.e.,
$O(n \log v)$. (It is an improvement because $n \log v < \sqrt{n}\,v^{3/2}$ for $v \le n \le 4v^2$.)

On the other hand, we extend the traditional layered network technique to our
dynamic network for finding augmenting paths. In the traditional layered network
approach, we repeatedly add a blocking flow in the layered network to the
current flow until we find the maximum flow. Owing to the unit capacity network,
the blocking flow in the layered network can be considered as a maximal
set P of disjoint and shortest s-t paths in the residual network. The layered
network approach is efficient in unit capacity networks because the path set P
can be found and the flow can be augmented along all paths in P in $O(m)$ time,
where m is the number of edges in the network. Also, we can obtain the maximum
flow after $O(\sqrt{m})$ rounds, i.e., in $O(m\sqrt{m})$ time. However, since our network is
dynamic, i.e., the number of vertices (in the reachability graphs) may change
after augmenting a flow along any path in P, the same layered network cannot
be used to find more than one augmenting path. Thus, we cannot directly
apply the layered network technique and improve the running time

† The result given in [3] is expressed as $O(d\sqrt{mnN})$, where d is the maximum flow and
$\sqrt{mnN}$ is the size of the dynamic network. In this paper, for ease of comparison,
the same formula is rewritten as $O(v\sqrt{nv})$, i.e., $O(\sqrt{n}\,v^{3/2})$, where v is the maximum flow and $\sqrt{nv}$ is
the size of the dynamic network.

immediately. Later in this paper, we will show how to build a different layered
network which may contain some dynamic components. This layered network is
built using a different measurement of distance between two vertices. In spite of
the dynamic nature of the layered network, we can find the path set P (under
the new distance measure) in the same layered network with some modification
in $O(\sqrt{nv})$ time. Moreover, we can repeat this process to get the maximum flow
in $O((nv)^{1/4})$ rounds. As a result, we improve the time complexity of solving the
DP problem to $O(n^{3/4} v^{3/4})$. It is faster than the previous $O(\sqrt{n}\,v^{3/2})$ result by
a factor of $v^{3/4}/n^{1/4}$. This factor is as large as $\sqrt{v}$ when n is $\Theta(v)$, while it is at
least $v^{1/4}$ when n is $\Theta(v^2)$.
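
The improvement factor can be checked directly from the two bounds. The short derivation below is a sketch that assumes the previous bound is $O(\sqrt{n}\,v^{3/2})$ (that is, $O(v)$ iterations over a network of size $O(\sqrt{nv})$) and that the new bound is $O(m\sqrt{m})$ with $m = O(\sqrt{nv})$; constant factors are ignored.

```latex
\[
\frac{\sqrt{n}\,v^{3/2}}{(\sqrt{nv})^{3/2}}
  = \frac{n^{1/2}\,v^{3/2}}{n^{3/4}\,v^{3/4}}
  = \frac{v^{3/4}}{n^{1/4}},
\qquad
\left.\frac{v^{3/4}}{n^{1/4}}\right|_{n = v} = v^{1/2},
\qquad
\left.\frac{v^{3/4}}{n^{1/4}}\right|_{n = 4v^{2}} = \frac{v^{1/4}}{\sqrt{2}} .
\]
```

Since the factor decreases as n grows, the two endpoints of the range $v \le n \le 4v^2$ give the largest and the smallest improvement, up to constant factors.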

Using the techniques presented in this paper, we can also improve the time 
complexity of the algorithm in solving the DP problem with vertex-disjoint pa- 
ths |3] to Since our procedure in isolating the disjoint and empty 

rectangles is simple and efficient, our algorithm should perform well in practice 
(especially if the push-relabel algorithm [Z] is implemented instead of the layered 
network approach). On the other hand, there are two reasons showing that it 
is difficult to further improve the time complexity. For one reason, 

since our compression of the network from $O(n)$ to $O(\sqrt{nv})$ is optimal [3], it is not
possible to apply our extended layered network technique to the DP problem
with a smaller network. For another reason, we know that the layered network
algorithm which runs in $O(m\sqrt{m})$ time is still the fastest known algorithm for
finding maximum flow in unit capacity and sparse networks. (See [Sj for more details.)
Considering that our dynamic network is also a sparse network where m is
$O(\sqrt{nv})$, we believe that further improvement on the $O(m\sqrt{m})$ or $O(n^{3/4} v^{3/4})$
bound on the DP problem may require new insights into the grid structure or new
techniques for finding maximum flow in sparse networks.

2 Preliminaries 

2.1 Layered Network and Partial Flow 

Let G = (V, E) be a network with s, t ∈ V in which each edge has unit capacity. A flow
in G can be regarded as a set of edge-disjoint s-t paths. Let $G_f$ be the residual
network [1] induced by a flow f. Define $\delta_f(u)$ to be the shortest distance^ from
s to u in $G_f$. A layered network is a subgraph of $G_f$ containing only the vertices
reachable from s and only those edges (u,w) such that $\delta_f(w) = \delta_f(u) + 1$. A
blocking flow f is a flow in G such that every s-t path in the network contains
an edge (u,w) with f(u,w) = 1.
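
The conventional layered network described above can be built with one breadth-first search. The sketch below assumes the residual network is given as a dictionary of adjacency lists; the function name and this representation are illustrative choices, not taken from the paper.

```python
from collections import deque


def layered_network(residual_adj, s):
    """Return (dist, layered): BFS distances from s and the layered subgraph.

    residual_adj maps each vertex to the list of vertices reachable by one
    residual edge; every edge is assumed to have unit length.
    """
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in residual_adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # Keep only vertices reachable from s and edges that advance one layer.
    layered = {
        u: [w for w in residual_adj.get(u, ()) if dist.get(w) == dist[u] + 1]
        for u in dist
    }
    return dist, layered


example = {"s": ["a", "b"], "a": ["b", "t"], "b": ["t"], "t": []}
dist, layered = layered_network(example, "s")
print(dist, layered)
```

In the example, the edge from a to b is dropped because both endpoints lie in the same layer.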

Let C be a set of disjoint and empty rectangles in the grid. Note that no two
rectangles in C share the same vertex. An uncovered edge is a grid edge outside
all rectangles (we write “rectangle” for an empty rectangle) in C. Both end
vertices of an uncovered edge are uncovered vertices. A partial flow is a flow
defined only on the uncovered edges but not on the edges inside any rectangle in
C (Figure 1). Since a rectangle does not contain any source or sink, the partial



^ Each edge is assumed unit distance.

[Figure 1 legend: source; sink; empty rectangle; flow (path); augmenting path.]



Fig. 1. Example of an isolation of disjoint empty rectangles, a partial flow and an 
augmenting path. 



flow f at every rectangle should be conserved, i.e., the net flow into every rectangle
is zero.

Given a rectangle, an h-cut $hc_i$ denotes the set of edges connecting
the ith row of vertices and the (i + 1)th row of vertices in the rectangle. A v-cut
$vc_i$ denotes the set of edges connecting the ith column of vertices and
the (i + 1)th column of vertices. The capacity of $hc_i$ is $|hc_i|$, and it equals the
width of the rectangle. Given a partial flow f, the demand of $hc_i$ is defined to be
the absolute value of the net flow into the rectangle on or above the ith row. For
example, in Figure 2(a), the net flow across $hc_1$ is 1. The capacity and demand of
a v-cut are defined similarly. An h-cut or v-cut is saturated if its demand equals
its capacity and oversaturated if its demand is larger than its capacity.

In order for a partial flow / to be developed into a “complete” flow by having 
the flow values (edge-disjoint paths) defined inside all rectangles, each rectangle 
should satisfy the following necessary and sufficient conditions. 







Fig. 2. A step of augmentation and update in the 4x4 empty rectangle B in Figure 1.
(a) There are 3 units of flow entering the rectangle and 3 units of flow leaving the rectangle.
(b) An augmenting path passes through the corresponding reachability graph.
(c) Update to a new reachability graph.

Lemma 1 ([3]). Given a p x q rectangle and a partial flow f, there exists a set
of edge-disjoint paths inside the rectangle matching the flow entering and leaving
the rectangle if and only if all the $hc_i$ for $1 \le i \le p-1$ and $vc_j$ for $1 \le j \le q-1$
are not oversaturated.
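
The condition of Lemma 1 can be tested with prefix sums over rows and columns. The sketch below assumes the rectangle has p rows and q columns (so an h-cut has capacity q and a v-cut has capacity p) and that the partial flow is summarized as a net inflow per boundary vertex; this input format and the function name are assumptions made for illustration.

```python
def cuts_not_oversaturated(p, q, boundary_flow):
    """Check the condition of Lemma 1 for a rectangle with p rows and q columns.

    boundary_flow is an iterable of (row, col, net_inflow) triples with 1-based
    indices, one per boundary vertex that carries flow; net_inflow is positive
    for flow entering the rectangle and negative for flow leaving it.
    """
    row_net = [0] * (p + 1)
    col_net = [0] * (q + 1)
    for r, c, net in boundary_flow:
        row_net[r] += net
        col_net[c] += net

    # Demand of hc_i: |net flow into rows 1..i|; capacity: the width q.
    running = 0
    for i in range(1, p):
        running += row_net[i]
        if abs(running) > q:
            return False

    # Demand of vc_j: |net flow into columns 1..j|; capacity: the height p.
    running = 0
    for j in range(1, q):
        running += col_net[j]
        if abs(running) > p:
            return False
    return True
```

For instance, three units entering along the top row of a 4x4 rectangle and leaving along the bottom row give every h-cut a demand of 3, below the capacity of 4.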



2.2 The Reachability Graphs in Residual Networks 

For a rectangle, the corresponding reachability graph retains all of the rectangle
boundary vertices for connection with the grid vertices outside the rectangle and
maintains a set of internal vertices to represent the connections between the
boundary vertices. The characteristic of a reachability graph is that whenever a
boundary vertex β is connected to another boundary vertex β', the flow can be
augmented from β to β' through the rectangle such that none of the h-cuts or
v-cuts is oversaturated. The structure of a reachability graph depends on how
the partial flow f passes through the rectangle and eventually depends on the
existence, quantity and location of the saturated cuts in the rectangle. There are
three different kinds of structures for the reachability graphs, whose sizes are all
linear in the number of boundary vertices (Figure 3).

Case (a). None of the h-cuts or v-cuts is saturated (Figure 3(a)):
The reachability graph is a star graph which has a path between any pair of
boundary vertices.

Case (b). Either some h-cuts or some v-cuts (but not both) are saturated
(Figure 3(b)): The rectangle is partitioned into rectangular components by the
saturated cuts. The part of the reachability graph representing a component is
a star graph, the same as that in Case (a). The direction of an edge between two
star graphs is opposite to that of the saturated cuts between the corresponding
components. It forces an augmenting path to go only from one boundary vertex
to another in the opposite direction of the saturated cuts.

Case (c). Both saturated h-cuts and v-cuts exist (Figure 3(c)): The
rectangle is again partitioned into components by the saturated cuts. In this
case, only the boundary components will be represented by star graphs, the same
as that in Case (a). Similar to the edges in Case (b), an edge between two star
graphs has the direction opposing the corresponding saturated cut.

3 The Algorithm for Edge-Disjoint Paths 

The algorithm starts by isolating a set of rectangles such that the number of
uncovered vertices and edges is $O(\sqrt{nv})$.

3.1 Isolation of Rectangles

The isolation procedure presented in [3] might output rectangles of size
$\sqrt{n} \times c$ for some constant c, which dominates the total update time of the reachability
graphs. In this section, we present another algorithm which isolates the
rectangles so that the size of each rectangle output is at most $\sqrt{n/v} \times \sqrt{n/v}$.

[Figure 3 legend: directed edge in the reachability graph; flow into or out of the block; saturated cut; internal node.]



Fig. 3. The reachability graphs for Cases (a), (b) and (c) 



Without loss of generality, assume the grid is $n_1$ by $n_2$ (i.e., $n = n_1 n_2$), and
$n_1$ and $n_2$ are multiples of $\sqrt{n/v}$. Algorithm 1 presents the procedure which
isolates a set of rectangles such that the number of uncovered vertices and edges
is $O(\sqrt{nv})$. An example using the procedure to isolate rectangles is shown in
Figure 1. Now we analyze the performance of the procedure. Firstly, the length
of each side of the output rectangles is less than or equal to $\sqrt{n/v}$. Secondly, we
observe that the uncovered edges are the edges removed in the procedure. As we
have removed $\sqrt{n_1 v / n_2} - 1$ h-cuts of $n_2$ edges each, $\sqrt{n_2 v / n_1} - 1$ v-cuts of $n_1$
edges each, and at most $O(v)$ rows of edges (the rows that contain sources or sinks, or the
rectangles in C that contain only one row) in the $\sqrt{n/v} \times \sqrt{n/v}$ rectangles, the
number of uncovered edges is $O(\sqrt{nv})$. Since the number of uncovered vertices is
linearly proportional to the number of uncovered edges, the number of uncovered
vertices is also $O(\sqrt{nv})$.
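
The isolation procedure can be sketched as follows. This is an illustrative version under stated assumptions: vertices are 0-based (row, column) pairs, `terminals` is the combined list of sources and sinks, the block side is taken as the integer square root of n/v, and rectangles are returned as inclusive (top, bottom, left, right) tuples; none of these conventions come from the paper.

```python
import math


def isolate(n1, n2, terminals):
    """Return empty rectangles of an n1 x n2 grid as (top, bottom, left, right)."""
    v = len(terminals)
    k = max(1, math.isqrt((n1 * n2) // max(1, v)))   # block side, about sqrt(n/v)

    # Rows that contain a source or sink, grouped by the k x k block they fall in.
    rows_with_terminal = {}
    for r, c in terminals:
        rows_with_terminal.setdefault((r // k, c // k), set()).add(r)

    rectangles = []
    for top in range(0, n1, k):
        for left in range(0, n2, k):
            bottom = min(top + k, n1) - 1
            right = min(left + k, n2) - 1
            blocked = rows_with_terminal.get((top // k, left // k), set())
            start = top
            for r in range(top, bottom + 2):       # bottom + 1 acts as a sentinel row
                if r == bottom + 1 or r in blocked:
                    if r - start >= 2:             # drop pieces with only one row
                        rectangles.append((start, r - 1, left, right))
                    start = r + 1
    return rectangles


terms = [(1, 1), (2, 6), (5, 2), (7, 7)]
print(isolate(8, 8, terms))
```

With 4 terminals in an 8 x 8 grid the block side is 4, and each block is split at the single row that holds a terminal.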

3.2 The Algorithm 

Denote the dynamic network induced by the partial flow / (where each 
rectangle in the grid is replaced by a corresponding reachability graph) . Since the 
structures of some reachability graphs may change drastically after augmenting 
merely one s-t path, we may not find more than one augmenting path in the same
G^. Therefore, if we build a layered network L from G^ based on the conventional 
distance measure, L may have to be rebuilt after adding only one path. To

Procedure: Isolate(S, T)

Input: an $n_1 \times n_2$ grid, a set of sources S and a set of sinks T.
Output: a set of empty rectangles C.

Let $k = \sqrt{n/v}$;
Partition the grid into a set of k x k rectangles C by removing
  (1) the edges in the h-cuts $hc_k, hc_{2k}, \ldots, hc_{n_1-k}$ and
  (2) the edges in the v-cuts $vc_k, vc_{2k}, \ldots, vc_{n_2-k}$;
foreach rectangle R ∈ C do
  if R contains some sources or sinks then
    Partition R by removing the rows of vertices (and edges incident to the
    vertices) that contain the sources or sinks in R;
Remove the rectangles in C that contain only one row of vertices;

Algorithm 1: Isolate(S, T)



improve the time complexity, we define a proper distance measure so that we 
can reuse (major part of) the layered network after adding a path. Let p be a path 
between two uncovered vertices in G Define the length of p to be the number 
of uncovered edges in p plus the number of rectangles that p passes through. The 
distance of an uncovered vertices w from s, denoted by 5^{w), is the length of 
the shortest path from s to w in the new definition. The new layered network L 
contains all uncovered vertices reachable from s and the uncovered edges (u, w) 
for S^{w) = 6j{u) + 1. Moreover, L also contains the path h = {ui,U 2 , ■ ■ ■ ,Uk) 
of a reachability graph if u \ , Uk are two boundary vertices of the corresponding 
rectangle with 5j{uk) = Sj{ui) + 1, and U 2 , ua, . . . , Uk-i are the internal vertices 
of the reachability graph. After L is constructed, we find a blocking flow in L 
(by New-blocking in Algorithm 2) similar to the conventional method but 
we have to update (reconstruct) the affected reachability graphs once a flow is 
augmented. 



Procedure: New-blocking(L)

Input: the layered network L defined above.

repeat
  Find the shortest s-t path p in L based on the new distance measure;
  Augment the flow along p;
  Update (reconstruct) every reachability graph that p passes through;
  Update L by the new reachability graphs;
until s and t in L are disconnected;

Algorithm 2: New-blocking(L)



We need some new definitions for the correctness proof below. Given a path 
h = (ui,U 2 , . . . , Ufc) in a reachability graph of G^, suppose u\,Uk are two bound- 
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ary vertices of a rectangle and all M 2 , W 3 , ■ ■ ■ , Ufe-i are the internal vertices in the 
corresponding reachability graph. We consider (ui,Uk) (and h) as a virtual edge 
of G Also, we redefine a blocking flow f in L as a partial flow such that every 

shortest s-t path in L contains either an uncovered edge (it, w) with f'{u, ic) = 1 
or a virtual edge (u, w) such that if /' is added to the current flow, one of the 
h-cuts or v-cuts in the rectangle is saturated from u to w. 

Lemma 2 shows that the distance of all uncovered vertices is nondecreasing if
the current flow is augmented along one of the shortest paths in the dynamic network. This enables the
layered network to be used to find more than one augmenting path. Lemma 3
argues that the distance of t is strictly increasing in each round of finding a blocking
flow. Let f + f' be the flow after adding f' to f.

Lemma 2. Let f be a partial flow along one of the shortest s-t paths in G^. 
We have S^{u) < <5^_|_j/(it) for all vertices u reachable by s. 

Proof. Let p be one of the shortest s-t paths in G^. Obviously, each edge (un- 
covered or virtual) (it, ic) in Gj has Sj{w) < S^{u) -\- 1. We shall show that each 
edge (u,w) in G^_^_p also has S^{w) < S^{u) -I- 1. Each uncovered edge in G^_^_p 
is either an edge ha G^ or is the reverse of an edge in p. For each virtual edge 
(it, ic) in rectangle R of Gj_^_p, either (it,u>) is a virtual edge in Gp or there 

were saturated cuts in i? of G j in the direction from u to w and /' is augmented 
against all those cuts. For instance, assume p contains a virtual edge {w’,u') in 
R opposite to the direction of the only saturated cut which is from it to ic. There 
must also exist another two virtual edges (ic', w) and (it, it') in i? of Gj, and thus 
Sj{w) < 5j(w') -I- 1 = dj{u') < Sj{u) -\- 1. The situation can be generalized for 
more saturated cuts between it and w. 

Since S^{s) = (s) = 0, we can deduce by a simple induction that <5j(it) < 

Sj_^_p{u) for all vertices u reachable by s. □ 



Lemma 3. Let f be a blocking flow in the layered network L of Gj. We have 
^/(^) < + 

Proof. Since /' is a partial flow on the edges of some of the shortest s-t paths, 
we have (by Lemma H]) S^{u) < Sj_^_p{u) for all vertices u reachable by s. For 
instance, Sj{t) < S^^p{t). Suppose Sj{t) = S^_^_p{t). Let p be a shortest s-t path 
in Gjj^p. We have for all edge (u,w) in p that -I- 1 = and 

hence S^{u) -I- 1 = S^(w). That means p is in L and it contradicts to the property 
that at least one of the uncovered or virtual edges in p do not appear in G . 
Hence Sj{t) < S^_^_p{t). □ 

By Lemma 3, we can bound the number of rounds in the algorithm by a
similar approach as given in [S].

Theorem 1. The algorithm terminates in at most $O((nv)^{1/4})$ rounds.

Proof. Consider the situation after the first $(nv)^{1/4}$ rounds. The shortest s-t distance is at least
$(nv)^{1/4}$. As the size of G^ is $O(\sqrt{nv})$, there are at most $O(\sqrt{nv}/(nv)^{1/4})$, i.e., $O((nv)^{1/4})$,
edge-disjoint paths in G^. Thus, there are at most $O((nv)^{1/4})$ rounds left. □

4 Time Analysis

The following lemma reveals the running time of the algorithm in a round.

Lemma 4. Procedure New-blocking finds and augments a blocking flow in
$O(\sqrt{nv}) + O(ab)$ time, where a and b are the number and the sizes of the reachability
graphs that have been reconstructed in the round, respectively.

Proof. Each uncovered edge is traversed once and then removed because of backtracking
or saturation. After traversing a virtual edge, the algorithm either backtracks
from that edge, or augments flow through the rectangle whose reachability
graph will then need to be reconstructed. Thus, the time spent on the uncovered
edges and the edges in the original reachability graphs is $O(\sqrt{nv})$, while the time
spent on the edges in the reconstructed reachability graphs is $O(ab)$. □

The following lemma and corollary bound the value of $\sum a$ over all rounds
in the algorithm.

Lemma 5. The sum of the path lengths of all the augmenting paths found in the
algorithm is $O(\sqrt{nv} \log v)$.



Proof. Let $e_i$ and $e^*$ be the number of augmenting paths (i.e., shortest edge-disjoint
s-t paths) found in the ith round and in all rounds of the algorithm, respectively.
In the first round, we find $e_1$ augmenting paths and each
of them has length at most $\sqrt{nv}/e^*$. In the ith round, we find $e_i$ augmenting
paths and each of them has length at most $\sqrt{nv}/(e^* - \sum_{j<i} e_j)$. Therefore the
sum of path lengths is

$$\sqrt{nv} \Big( \underbrace{\tfrac{1}{e^*} + \cdots + \tfrac{1}{e^*}}_{e_1} + \underbrace{\tfrac{1}{e^* - e_1} + \cdots + \tfrac{1}{e^* - e_1}}_{e_2} + \cdots \Big) = \sqrt{nv} \sum_{i \ge 1} \frac{e_i}{e^* - \sum_{j<i} e_j} \le O(\sqrt{nv}) \log e^* \le O(\sqrt{nv} \log v). \qquad \square$$
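
The logarithmic bound in the proof of Lemma 5 rests on comparing the per-round sum with a harmonic number. The sketch below checks this comparison with exact rational arithmetic; the batch sizes and function names are hypothetical and chosen only for illustration.

```python
import math
from fractions import Fraction


def round_sum(batches):
    """Sum of e_i / (e* - sum_{j<i} e_j) over rounds, for positive batch sizes e_i."""
    remaining = sum(batches)          # e*, the total number of augmenting paths
    acc = Fraction(0)
    for e_i in batches:
        acc += Fraction(e_i, remaining)
        remaining -= e_i
    return acc


def harmonic(m):
    """The m-th harmonic number 1 + 1/2 + ... + 1/m as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, m + 1))


batches = [3, 2, 4, 1]                # hypothetical numbers of paths per round
print(round_sum(batches), harmonic(sum(batches)), 1 + math.log(sum(batches)))
```

Each term $e_i/(e^* - \sum_{j<i} e_j)$ is at most the sum of the $e_i$ consecutive reciprocals it replaces, so the whole sum is at most the harmonic number of $e^*$, which grows like $\log e^*$.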



Corollary 1. At most $O(\sqrt{nv} \log v)$ reachability graphs are reconstructed in the
algorithm.

Proof. Flow is augmented along at most $O(\sqrt{nv} \log v)$ virtual edges. □

Lemma 6. The algorithm takes $O(n^{3/4} v^{3/4} + b\sqrt{nv} \log v)$ time to find the maximum
partial flow.

Proof. The running time of the algorithm includes the time to (1) isolate the
rectangles ($O(n)$ time), (2) construct the $O((nv)^{1/4})$ layered networks (Theorem 1)
and find the blocking flows in each layered network, and (3) construct the
edge-disjoint paths inside each rectangle after all rounds ($O(n)$ time). In the
worst case, the running time is dominated by Step (2), which is
$O(n^{3/4} v^{3/4} + b\sqrt{nv} \log v)$. □

If the size of a rectangle is $\sqrt{n} \times c$ (for some constant c) as given in [3], b
will be $O(\sqrt{n})$. Thus, no improvement on the time complexity can be achieved.
However, with our new algorithm to isolate the rectangles in the grid (Section 3.1),
the size of each rectangle is at most $\sqrt{n/v} \times \sqrt{n/v}$, i.e., $b = O(\sqrt{n/v})$. In
conclusion, we have the following theorem.

Theorem 2. The algorithm solves the DP problem in $O(n^{3/4} v^{3/4})$ time.

Abstract. We consider the following one- and two-dimensional bucketing
problems: Given a set S of n points in $\mathbb{R}^1$ or $\mathbb{R}^2$ and a positive
integer b, distribute the points of S into b equal-size buckets so that the
maximum number of points in a bucket is minimized. Suppose at most
(n/b) + Δ points lie in each bucket in an optimal solution. We present
algorithms whose time complexities depend on b and Δ. No prior
knowledge of Δ is necessary for our algorithms.

For the one-dimensional problem, we give a deterministic algorithm that 
achieves a running time of 0{b'^ A^ logn -|- n). For the two-dimensional 
problem, we present a Monte-Carlo algorithm that runs in sub-quadratic 
time for certain values of b and Δ. The previous algorithms, by Asano
and Tokuyama [1], searched the entire parameterized space and required
$\Omega(n^2)$ time in the worst case even for constant values of b and Δ.



1 Introduction 

We consider geometric optimization problems that do not seem to have any nice
properties like convexity and have a large number of distinct global optimal solutions.
Consequently, it is hard to develop a search strategy that will avoid looking
at all the optimum solutions (or more likely near-optimal solutions). However,
if the number of optimal solutions is small, we may be able to prune the search
space. This may lead to more efficient algorithms that are “output-sensitive”
where the notion of output is related to the number of optimal solutions. Since
we do not know the optimum solution to begin with, we can try to estimate it
by some means, say random sampling, and then use that to prune the search
space. The success of such an approach depends on how effectively we can prune
the search space.

In this paper we consider the problem of partitioning a set of points in $\mathbb{R}^1$ or
$\mathbb{R}^2$ into equal-size buckets, so that the maximum number of points in a bucket
is minimized. The first problem that we consider is the following: Given a set S
of n real numbers and an integer $1 \le b \le n$, partition S uniformly into b equal-sized
buckets, i.e., each bucket has the same width. The buckets are defined by
real numbers $\beta_i = L + i \cdot w$, for $0 \le i \le b$, where L is the left endpoint of the
left-most bucket and w is the width (size) of the buckets. The i-th bucket $B_i$
is defined by the interval $[\beta_i, \beta_{i+1})$ and $S \cap B_i$ is the content of the i-th bucket
(for a fixed choice of L and w). We wish to minimize the maximum size of the
contents of the buckets. Two versions of this problem are studied: (i) the tight case,
in which $B_1$ and $B_b$ are required to be nonempty, and (ii) the relaxed case, in
which they are allowed to be empty.






Fig. 1. (i) One-dimensional bucketing problem; (ii) uniform-projection problem; (iii) 
two-dimensional partitioning problem. 

Next, we consider the two-dimensional problem. Given a set S of n points in
$\mathbb{R}^2$ and an integer $b \le n$, we again wish to partition S into b equal-size buckets so
that the maximum number of points in a bucket is minimized. We consider two
types of buckets. First, we consider the case in which the buckets are formed
by b + 1 equally spaced parallel lines, $\ell_0, \ldots, \ell_b$, with orientation θ, for some
$\theta \in S^1$. We require S to lie between $\ell_0$ and $\ell_b$, and each of $\ell_0, \ell_b$ contains at
least one point of S. The buckets are b strips defined by consecutive lines $\ell_{i-1}$
and $\ell_i$ ($1 \le i \le b$); see Figure 1(ii). This bucketing problem is known as the
uniform-projection problem. We next define buckets to be the regions formed by
two families of $\sqrt{b} + 1$ equally spaced lines. The extremal lines in both families
are required to contain at least one point of S; see Figure 1(iii). This problem
is called the two-dimensional partition problem.

Asano and Tokuyama [Tj describe O(n^) and 0(6^n^)-time algorithms for the 
tight and relaxed cases of the one-dimensional problem. We are able to obtain 
an 0(6^Z\^nlogn) deterministic algorithm for the tight case and 0{b^ A^nlogn) 
algorithm for the relaxed case. The running time of our algorithms is $O(n)$ for
constant values of b and Δ. The algorithm itself does not require the value of
Δ; the value is required only for the analysis. This problem has applications to
the construction of optimal hash functions [1].

Comer and O’Donnell |3] described an algorithm for the uniform-projection 
problem that runs in 0{bn?log{bn)) time using 0{n^ + bn) spce. Asano and 
Tokuyama [T] gave an 0{n^ log n)-time algorithm, which uses 0{n) space, by 
exploiting the dual transformation of the problem. They also give alternative im- 
plementations that could be better for smaller b, but the worst-case running time 
is I7(n^) even for constant values of b. Bhattacharya |2] also gave an alternate 
approach for this problem, using the angle-sweep method. We first describe a de- 
terministic log^^^ n)-time algorithm for b — 2 for the uniform-projection 

problem, thus improving the quadratic upper-bound. For larger values of b, we 
describe a Monte Carlo algorithm that computes an optimal solution in time 
0{mm{n^/^b^log^^^ n-\-b^Anlogn,n^}), with probability at least 1 — 1/n. The 
running time increases to 0(min{&n^^^ log^^^ n b'^An log n, n^})) if we restrict 
space to be linear. The dependence of the running time on $\Delta$ is borne out by the
fact that the number of possible optimal configurations (having the same value) 
increases. 

Our overall approach for both the problems is similar. Namely, we use a 
sample to “localize” the search for the global optimum. Although intuitively, this 
is a good heuristic, analyzing the bound on the number of “potential” candidates 
for the global optimum, from the optima of the sample, is rather technical. In 
the one-dimensional problem, we can simply choose a “deterministic” sample
because the elements are linearly ordered, but the two-dimensional algorithms 
rely on random sampling. 



2 Optimal One-Dimensional Cuts 



For a set $S$ of real numbers $x_i$, $1 \le i \le n$, and an integer $b$, a pair $c = (L, w)$ is
called a cut if the $b+1$ real numbers $P_j$, $0 \le j \le b$, defined as $P_0 = L$ and
$P_j = L + j \cdot w$ for $j > 0$, are such that $P_0 \le x_1 \le x_n < P_b$. The interval $[P_{j-1}, P_j)$
is called the $j$-th bucket and the set of $x_i$'s lying (strictly) in this interval is the
contents of the $j$-th bucket. We will denote the $j$-th bucket by $B_j$ and the size
of its contents, $|B_j \cap S|$, by $|B_j|^c$ for a cut $c$.
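The definition of a cut can be made concrete with a small sketch. The function names below (`bucket_sizes`, `cut_value`, `is_tight_cut`) are illustrative and do not come from the paper, and the sketch assumes half-open buckets $[P_{j-1}, P_j)$ and exact (integer or rational) arithmetic.

```python
def bucket_sizes(xs, L, w, b):
    """Return [|B_1|, ..., |B_b|] for the cut (L, w) on the points xs.

    Bucket j is taken to be the half-open interval [L + (j-1)*w, L + j*w).
    Raises ValueError if (L, w) is not a cut, i.e. some point is outside [L, L + b*w).
    """
    if w <= 0:
        raise ValueError("bucket width must be positive")
    sizes = [0] * b
    for x in xs:
        j = int((x - L) // w)  # 0-based index of the bucket containing x
        if j < 0 or j >= b:
            raise ValueError("point %r lies outside the cut" % (x,))
        sizes[j] += 1
    return sizes


def cut_value(xs, L, w, b):
    """Maximum bucket size of the cut (L, w)."""
    return max(bucket_sizes(xs, L, w, b))


def is_tight_cut(xs, L, w, b):
    """True iff the first and the last bucket are both non-empty."""
    sizes = bucket_sizes(xs, L, w, b)
    return sizes[0] >= 1 and sizes[-1] >= 1
```

For example, for the points $0, 1, \ldots, 7$ the cut $(L, w) = (0, 3)$ with $b = 3$ puts three, three, and two points into the buckets, so its cut-value is 3.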

Let $\mathcal{C}$ be the set of all cuts. The optimal cut-value, $\Phi(S)$, is defined as
$\min_{c \in \mathcal{C}} \{\max_{1 \le j \le b} \{|B_j|^c\}\}$, and any cut that achieves this is an optimal cut. If
we restrict the cuts to satisfy the condition that $|B_1|^c, |B_b|^c \ge 1$, then it is called
a tight cut. That is, the first and the last buckets cannot be empty. An optimal
tight cut is defined analogously as above (restricted to the set of tight cuts). We
will first describe an algorithm for finding an optimal tight cut.



Definition 1 Two cuts $c_1$ and $c_2$ are combinatorially distinct iff there are
buckets Bi and Bj such that and \Bj\^^ ^ 



406 P.K. Agarwal, B.K. Bhattacharya, and S. Sen 



We parameterize the problem as follows. We represent each cut $c = (L, w)$ as
a point in the plane. Let $\mathcal{L} = \{x_i = L + jw \mid 1 \le i \le n,\ 0 \le j \le b\}$ be the set of
$(b+1)n$ lines in the $(L, w)$-plane, which we refer to as the event lines. $\mathcal{L}$ consists
of $b+1$ families of parallel lines (one for each fixed $j$), each family containing
$n$ lines; see Figure 2 (i). Hence, every face in $\mathcal{A}(\mathcal{L})$ contains at most $2(b+1)$
edges. For all cuts $c = (L, w)$ lying in the same face $f$ of $\mathcal{A}(\mathcal{L})$ the cut-value
remains the same; we will denote this value by $\Phi(f, S)$. Let $\Phi_j(f, S) = |B_j(S)|^c$
for any $c \in f$. The non-empty condition of extreme buckets implies that we have
to consider only those cuts $(L, w)$ that lie in the quadrilateral $Q$ defined by the
intersection of the following four constraints.

$$Q:\quad x_1 \ge L > x_1 - w \quad\text{and}\quad \frac{x_n - L}{b} < w \le \frac{x_n - L}{b-1}. \qquad (1)$$

The above constraint also leads to the following lemma whose proof is omitted 
from this version. 

Lemma 1. For every point $x_i \in S$ there exists an integer $1 \le j \le b-1$, such
that $x_i$ lies in one of the two buckets $B_j$ or $B_{j+1}$ for any tight cut.

This lemma immediately implies that at most $n$ lines of $\mathcal{L}$ intersect $Q$ and
thus $Q$ intersects $O(n^2)$ faces of $\mathcal{A}(\mathcal{L})$. The lines of $\mathcal{L}$ that intersect $Q$ can be
determined in $O(bn)$ time. We can therefore search over $Q \cap \mathcal{A}(\mathcal{L})$ in $O(n^2)$ time
to find all combinatorially distinct optimal cuts.

Lemma 2. For a set of $m$ points, all the combinatorially distinct optimal cuts
can be computed in $O(m^2)$ time.





Fig. 2. (i) Set $\mathcal{L}$ and the feasible region $Q$; (ii) Shaded regions denote $C_{22}$, $C_{23}$, and
$C_{24}$, and the dark region denotes $C(2, 4; 2)$, the set of cuts for which $\{x_2, x_3, x_4\}$ lie in
the second bucket $B_2$.

For an integer $r \ge 1$, let $R \subseteq S$ be the subset of $r$ points obtained by choosing
every $(n/r)$-th point of $S$. From our previous observation about directly solving
the problem, we can compute the optimal solution for $R$ in $O(r^2)$ time.

Lemma 3. Let $n_0$, $r_0$ be the maximum size of a bucket in an optimal solution
for $S$ and $R$, respectively. Then $\left|\frac{n_0}{n} - \frac{r_0}{r}\right| \le \frac{1}{r}$.

Proof. Let $c$ be an optimal cut for $R$. Each bucket of $c$ contains at most $r_0$
points of $R$. Since $R$ is chosen by selecting every $(n/r)$-th point of $S$, each bucket of
$c$ contains at most $(r_0 + 1)n/r - 1$ points of $S$. Therefore $n_0/n < (r_0 + 1)/r$. A
similar argument shows that $n_0/n \ge (r_0 - 1)/r$. □
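The counting step in this proof can be checked by brute force on small inputs. The sketch below is only an illustration: it assumes that $r$ divides $n$, works with ranks $0, \ldots, n-1$ instead of coordinates, and uses the hypothetical helper names `every_kth_sample` and `longest_run_with_t_samples`.

```python
def every_kth_sample(sorted_xs, r):
    """Pick every (n/r)-th element of a sorted list (this sketch assumes r divides n)."""
    n = len(sorted_xs)
    if n % r != 0:
        raise ValueError("this sketch assumes that r divides n")
    k = n // r
    return [sorted_xs[i] for i in range(k - 1, n, k)]


def longest_run_with_t_samples(n, r, t):
    """Length of the longest run of consecutive ranks containing exactly t sampled ranks."""
    k = n // r
    sampled = set(range(k - 1, n, k))
    best = 0
    for i in range(n):
        count = 0
        for j in range(i, n):
            if j in sampled:
                count += 1
            if count > t:
                break
            if count == t:
                best = max(best, j - i + 1)
    return best
```

Since a bucket holds a run of consecutive points, a bucket with $t$ sample points holds at most `longest_run_with_t_samples(n, r, t)` points of $S$, and the proof claims this is at most $(t+1)n/r - 1$.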

We now describe the algorithm for computing an optimal solution for $S$,
assuming that we have already computed the value of $r_0$. Let $C_{ij}$ denote the
set of points $c = (L, w)$ in the $(L, w)$-plane so that the point $x_j \in S$ lies in the
bucket $B_i$ of the cut $c$. Then $C_{ij} = \{(L, w) \mid L + (i-1)w \le x_j < L + iw\}$ is
the cone with apex at $(L, w) = (x_j, 0)$. Given three integers $1 \le l \le r \le n$ and $1 \le i \le b$,
the set of points in the $(L, w)$-plane for which the subset $\{x_l, x_{l+1}, \ldots, x_r\}$ of $S$
lies in the $i$-th bucket $B_i$ is $C(l, r; i) = \bigcap_{j=l}^{r} C_{ij}$. $C(l, r; i)$ is a cone formed by the
intersection of the halfplanes $L + (i-1)w \le x_l$ and $L + iw > x_r$.
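As a small illustration of why only two halfplanes are needed, the following sketch compares the two-halfplane test with the pointwise definition. It assumes the points are sorted, indices are 1-based as in the text, and $w > 0$; the function names are hypothetical.

```python
def in_cone(L, w, l, r, i, xs):
    """Two-halfplane test for (L, w) in C(l, r; i); xs must be sorted, indices are 1-based."""
    return L + (i - 1) * w <= xs[l - 1] and xs[r - 1] < L + i * w


def in_cone_by_definition(L, w, l, r, i, xs):
    """Pointwise test: every x_j with l <= j <= r lies in bucket B_i of the cut (L, w)."""
    return all(L + (i - 1) * w <= xs[j - 1] < L + i * w for j in range(l, r + 1))
```

Because the points are sorted, checking the smallest point against the lower boundary and the largest point against the upper boundary is enough.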

By Lemma 3, $(r_0 - 1)n/r \le n_0 \le (r_0 + 1)n/r$. On the other hand, $n_0 \ge n/b$. We
perform a binary search in the interval $[\max\{\lceil (r_0 - 1)n/r \rceil, n/b\}, \lfloor (r_0 + 1)n/r \rfloor]$
to find an optimal cut for $S$. At each step of the binary search, given an integer
$m$, we want to determine whether $n_0 \le m$.
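The binary search itself is standard. A minimal sketch follows, in which the decision procedure is passed in as a hypothetical callback `feasible(m)` answering whether $n_0 \le m$; rounding $n/b$ up is an assumption made here because $n_0$ is an integer.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, without floating point."""
    return -(-a // b)


def search_interval(n, b, r, r0):
    """Integer interval [lo, hi] that must contain n_0, given the sample optimum r0."""
    lo = max(ceil_div((r0 - 1) * n, r), ceil_div(n, b))
    hi = ((r0 + 1) * n) // r
    return lo, hi


def smallest_feasible(lo, hi, feasible):
    """Smallest m in [lo, hi] with feasible(m) true, assuming feasible is monotone.

    Returns None if even hi is infeasible.
    """
    if not feasible(hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

The number of calls to `feasible` is logarithmic in the length of the interval, which is at most about $2n/r$.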




Fig. 3. The boundary $P_i$ can lie in the shaded interval $[x_{l_i}, x_{r_i})$.

Fix an integer $m = (n/b) + \delta$ for $\delta \ge 0$. If $b^2\delta > n$, then we use the $O(n^2)$-time
algorithm described earlier to compute an optimal cut, so assume that $b^2\delta \le n$.
If each $B_i$ in a cut $c$ contains at most $m$ points of $S$, then, for any $1 \le i < b$,
the first $i$ buckets in $c$ contain at most $mi$ points and the last $b - i$ buckets in $c$
contain at most $(b - i)m$ points; therefore $P_i$ lies in the interval $[x_{l_i}, x_{r_i})$, where
$l_i = n - m(b - i)$ and $r_i = mi$. Set $r_0 = 1$; see Figure 3. Note that $r_i - l_i = b\delta$
for $1 \le i < b$. This implies that the subset $S_i = \{x_j \mid r_{i-1} \le j \le l_i\}$ always
lies in the $i$-th bucket $B_i$ (see Figure 3), for all $1 \le i \le b$. Hence, if there is a cut
$\xi = (L, w)$ so that all buckets in $\xi$ contain at most $m$ points, then $\xi$ lies in the
region $P(m) = \bigcap_{i=1}^{b} C(r_{i-1}, l_i; i)$, which is the intersection of $b$ cones and
is thus a convex polygon with at most $2b$ edges. Next, let $H_i \subseteq \mathcal{L}$ be the set of
$r_i - l_i = b\delta$ lines defined as $H_i = \{L + iw = x_j \mid l_i \le j < r_i\}$. Set $H = \bigcup_i H_i$,
$|H| = b^2\delta$. The same argument as in Lemma 1 shows that no line of $\mathcal{L} \setminus H$
intersects the interior of the polygon $P(m)$.
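The rank intervals are easy to tabulate. The sketch below assumes that $b$ divides $n$ so that $m = n/b + \delta$ is an integer; `boundary_rank_intervals` is an illustrative name, not one used in the paper.

```python
def boundary_rank_intervals(n, b, delta):
    """Map i -> (l_i, r_i) for 1 <= i < b, where l_i = n - m(b - i), r_i = m*i, m = n/b + delta."""
    if n % b != 0:
        raise ValueError("this sketch assumes that b divides n")
    m = n // b + delta
    return {i: (n - m * (b - i), m * i) for i in range(1, b)}
```

For $n = 12$, $b = 3$, $\delta = 1$ this gives $(l_1, r_1) = (2, 5)$ and $(l_2, r_2) = (7, 10)$, and both intervals have width $b\delta = 3$, as noted in the text.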

We construct the arrangement $\mathcal{A}(H)$ within the polygon $P(m)$ in $O(b^4\delta^2)$
time. (Actually, we can clip $\mathcal{A}(H)$ inside $P(m) \cap Q$, where $Q$ is the quadrilateral
defined in (1).) Let $\mathcal{A}_P(H)$ denote this clipped arrangement. By the above discussion,
$\mathcal{A}_P(H)$ is the same as $\mathcal{A}(\mathcal{L})$ clipped within $P(m)$. Therefore for any two
points $(L, w)$ and $(L', w')$ in a face $f \in \mathcal{A}_P(H)$, the contents of all buckets in
the cuts $(L, w)$ and $(L', w')$ are the same. Let $\varphi(f) = (\Phi_1(f, S), \ldots, \Phi_b(f, S))$.

If $f$ and $f'$ are two adjacent faces of $\mathcal{A}_P(H)$ separated by a line $L + iw = x_j$,
then the only difference in the two cuts is that $x_j$ lies in $B_i$ in one of them and
it lies in $B_{i+1}$ in the other. Therefore $\varphi(f')$ can be computed from $\varphi(f)$ in $O(1)$
time. By spending $O(n)$ time at one of the faces of $\mathcal{A}_P(H)$ and then traversing
the planar subdivision $\mathcal{A}_P(H)$, we can compute $\Phi(f, S)$ for all of its faces in
$O(b^4\delta^2 + n)$ time. If there is any face $f$ for which $\Phi(f, S) \le m$, we can conclude
that $n_0 \le m$. Otherwise, $n_0 > m$. Since $n_0 = (n/b) + \Delta \ge (r_0 - 1)n/r$ and
$m = (n/b) + \delta \le (r_0 + 1)n/r$, we have $\delta \le \Delta + 2n/r$. The total time in
performing the binary search is $O((b^4(\Delta + (n/r))^2 + n) \log(n/r))$. Hence, the total
time spent in computing an optimal cut is

$$O\!\left(r^2 + b^4\left(\Delta + \frac{n}{r}\right)^2 \log\frac{n}{r} + n \log\frac{n}{r}\right).$$



Choosing r 



h^Jn\og^^‘^{n/b'^) 



, we obtain the following. 



Lemma 4. An optimal tight cut for n points into b buckets can be found in 
0{b'^A'^log{n/b‘^) -I- 6^nlog^^^(n/6^)) time. 



Instead of using the quadratic algorithm for computing $r_0$, we can compute
$r_0$ recursively; we then obtain the following recurrence. Let $T(m)$ denote the
maximum running time of the algorithm for computing an optimal cut for the
subset of $S$ of size $m$ chosen by selecting every $(n/m)$-th point of $S$. Then we have



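$$T(n) = \begin{cases} T(r) + O\!\left(\left(b^4\left(\Delta + \frac{n}{r}\right)^2 + n\right)\log\frac{n}{r}\right) & \text{if } b^2\left(\Delta + \frac{n}{r}\right) < n,\\[4pt] O(n^2) & \text{otherwise.} \end{cases}$$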



Choosing $r = n/2$ and using the fact that $r_0 \le n_0 r/n + 1$, we can show that
$T(n) = O(b^4\Delta^2 \log n + n)$.



Theorem 1. An optimal tight cut for $n$ points into $b$ buckets can be found in
$O(b^4\Delta^2 \log n + n)$ time.



Remark. By choosing the value of r more carefully, we can improve the running 
time to 0{b‘^A'^ log_^ n + n log log A). 

We can use a similar analysis for finding optimal cuts, including relaxed cuts. 
We simply replace n by bn as there are bn event lines. Another way to view this 
is that the optimal cut can be determined by trying out all non-redundant cuts 
for rj buckets for 2 < p < b and selecting the best one. 

Corollary 1. An optimal relaxed cut for a set of n points into b buckets can be 
found in 0{b^A'^ log n + bn) time. 



3 The Uniform-Projection Problem 

In this section we describe the algorithms for the uniform-projection problem.
Let $S = \{p_1, \ldots, p_n\}$ be a set of $n$ points in $\mathbb{R}^2$ and $1 \le b \le n$ an integer. We
want to find $b+1$ equally spaced parallel lines so that all points of $S$ lie between
the extremal lines, the extremal lines contain at least one point of $S$, and the
maximum number of points in a bucket is minimized. If the lines have slope $\theta$, we
refer to these buckets as a $\theta$-cut of $S$. We first describe a subquadratic algorithm
for $b = 2$. Next, we show how the running time of the algorithm by Asano and
Tokuyama can be improved, and then we describe a Monte Carlo algorithm that
computes $\Phi(S)$, the optimum value, with high probability, in subquadratic time
for certain values of $b$ and $\Delta$.

It will be convenient to work in the dual plane. The duality transform maps
a point $p = (a, b)$ to the line $p^*: y = -ax + b$ and a line $\ell: y = \alpha x + \beta$ to
the point $\ell^* = (\alpha, \beta)$. Let $\ell_i$ denote the line dual to the point $p_i \in S$, and let
$\mathcal{L} = \{\ell_i \mid 1 \le i \le n\}$. Let $\mathcal{A}(\mathcal{L})$ denote the arrangement of $\mathcal{L}$. We define the level
of a point $p \in \mathbb{R}^2$ with respect to $\mathcal{L}$, denoted by $\lambda(p, \mathcal{L})$, to be the number of lines
in $\mathcal{L}$ that lie below $p$. All points within an edge or a face of $\mathcal{A}(\mathcal{L})$
have the same level. For an integer $0 \le k < n$, we define the $k$-level of $\mathcal{A}(\mathcal{L})$,
denoted by $\mathcal{A}_k(\mathcal{L})$, to be the closure of the set of edges of $\mathcal{A}(\mathcal{L})$ whose levels are
$k$. $\mathcal{A}_k(\mathcal{L})$ is an $x$-monotone polygonal chain with at most $O(n(k + 1)^{1/3})$ edges,
by a result of Dey. The lower and upper envelopes of $\mathcal{A}(\mathcal{L})$ are the levels $\mathcal{A}_0(\mathcal{L})$ and $\mathcal{A}_{n-1}(\mathcal{L})$.
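The duality transform and the level of a point can be written down directly. In the sketch below a line $y = sx + c$ is represented by the pair `(s, c)`; this representation and the function names are choices made here for illustration.

```python
def dual_line(p):
    """Dual of the point p = (a, b): the line y = -a*x + b, returned as (slope, intercept)."""
    a, b = p
    return (-a, b)


def dual_point(line):
    """Dual of the line y = alpha*x + beta: the point (alpha, beta)."""
    alpha, beta = line
    return (alpha, beta)


def lies_on(point, line):
    """True iff the point lies on the line (exact arithmetic assumed)."""
    x, y = point
    s, c = line
    return y == s * x + c


def level(q, lines):
    """Number of lines passing strictly below the point q."""
    x, y = q
    return sum(1 for (s, c) in lines if s * x + c < y)
```

The transform preserves incidence: the point $(2, 5)$ lies on the line $y = 3x - 1$, and correspondingly the dual point $(3, -1)$ lies on the dual line $y = -2x + 5$.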

Since we require the extreme bucket boundaries to contain a point of $S$,
they correspond to points on the upper and lower envelopes of $\mathcal{L}$. For a fixed
$x$-coordinate $\theta$, let $s(\theta)$ denote the vertical segment connecting the points on
the lower and upper envelopes of $\mathcal{L}$ with the $x$-coordinate $\theta$. We can partition
$s(\theta)$ into $b$ equal-length subsegments $s_1(\theta), \ldots, s_b(\theta)$. Let $\beta_0(\theta), \ldots, \beta_b(\theta)$ be the
endpoints of these segments. These endpoints are the duals of the bucket boundaries
with slope $\theta$, and $s_i(\theta)$ is the dual of the $i$-th bucket in the $\theta$-cut. The line $\ell_j$
intersects $s_i(\theta)$, $i \le b$, iff the point $p_j$ lies in the bucket $B_i$ corresponding to the
$\theta$-cut. Let $\beta_i$ denote the path traced by the endpoint $\beta_i(\theta)$ as we vary $\theta$ from
$-\infty$ to $+\infty$. If we vary $\theta$, as long as the endpoints of $s(\theta)$ do not pass through
a vertex of the upper or lower envelope, $\beta_i(\theta)$, $0 \le i \le b$, trace along line segments.
Therefore each $\beta_i$ is an $x$-monotone polygonal chain with at most $2n$ vertices;
see Figure 4 for an illustration. Since we will be looking at the problem in the
dual plane from now on, we will call the $\beta_i$'s bucket lines. Let $\mathcal{B} = \{\beta_0, \ldots, \beta_b\}$. The
intersection of a bucket line with a line (that is the dual of a point) marks an event
where the point switches between the corresponding buckets.

For an $x$-coordinate $\theta$ and a subset $A \subseteq \mathcal{L}$, let $\mu_i(A, \theta)$ denote the number
of lines of $A$ that intersect the vertical segment $s_i(\theta)$; that is, $\mu_i(A, \theta)$ is the
number of points dual to $A$ that lie in the $i$-th bucket in the $\theta$-cut. Let $\Phi(A, \theta) =
\max_i \{\mu_i(A, \theta)\}$. Set $n_0 = \Phi(S) = \min_x \Phi(S, x)$.
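For a single slope $\theta$, the bucket counts can be evaluated directly from the dual picture: the dual line of $p = (a, y)$ has height $y - a\theta$ at $x = \theta$, and the segment between the lowest and highest such heights is split into $b$ equal parts. The sketch below is a brute-force evaluation for one $\theta$, not the paper's algorithm; it assigns a point on an interior bucket boundary to the upper bucket and the topmost point to the last bucket, which is a convention chosen here.

```python
def theta_cut_bucket_sizes(points, theta, b):
    """Bucket sizes mu_1, ..., mu_b of the theta-cut of a planar point set.

    Each point (a, y) is mapped to the height y - a*theta of its dual line at x = theta.
    """
    heights = [y - a * theta for (a, y) in points]
    lo, hi = min(heights), max(heights)
    if hi == lo:
        raise ValueError("all points lie on one line of slope theta")
    sizes = [0] * b
    for h in heights:
        i = int(((h - lo) * b) // (hi - lo))  # 0-based subsegment index
        sizes[min(i, b - 1)] += 1             # the top endpoint goes to the last bucket
    return sizes


def phi(points, theta, b):
    """Phi(S, theta): the largest bucket of the theta-cut."""
    return max(theta_cut_bucket_sizes(points, theta, b))
```

Minimizing `phi` over all $\theta$ gives $n_0$; the algorithms in this section avoid evaluating it from scratch at every candidate slope.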

Fig. 4. The uniform-projection problem and the bucket lines in the dual setting.

3.1 Partitioning into Two Buckets

We will first describe a deterministic scheme that takes subquadratic time to
find an optimal solution for partitioning $S$ into two buckets. By our convention,
$\beta_0, \beta_2$ denote the upper and lower envelopes of $\mathcal{L}$, respectively. To determine
$n_0$, we will search for an $x$-coordinate $x_0$ where $\beta_1(x_0)$ is closest to the middle
level $\mathcal{A}_{n/2}$ of the arrangement $\mathcal{A}(\mathcal{L})$. Let $\lambda_0$ be the level of $\beta_1(x_0)$. We do a
binary search to determine $\lambda_0$ in the following manner. We start by looking for
an intersection between $\mathcal{A}_{n/2}$ and $\beta_1$. If there is one, then we report the optimum
$n/2$ split. Otherwise we repeat the procedure for the levels between $0$ and $n/2$
if $\beta_1$ lies below $\mathcal{A}_{n/2}$, and for the levels between $n/2$ and $n - 1$ if $\beta_1$ lies above
^n/2- Since the maximum size of a level of n lines is [Ij and it can 

be constructed in an output-sensitive manner, each phase of the search takes 
at most log^’*’^ n) time for any £ > 0. Note that the total number of 

intersections between jSi which has size n and any level of size s is 0(n + s). 

Hence for the case 6 = 2, we have a sub-quadratic algorithm. 

Lemma 5. The optimal uniform projection of n points in into two buckets 
can he computed in log^’*’^ n) steps, for any £ > 0, using 0(n) space. 

3.2 A Deterministic Algorithm 

In this section we present a deterministic algorithm for the uniform-projection
problem that has $O(bn \log n + K \log n)$ running time and uses $O(n + b)$ storage,
where $K$ denotes the number of event points, i.e., the number of intersection
points between $\mathcal{L}$ and $\mathcal{B}$. This improves the running times of $O(n^2 + bn + K \log n)$
for general $b$ and $O(b^{0.610} n^{1.695} + K \log n)$ for $b < \sqrt{n}$ in [1].

As in Asano-Tokuyama's algorithm, we will sweep a vertical line through
$\mathcal{A}(\mathcal{L})$, but unlike their approach we will not stop at every intersection point of
$\mathcal{L}$ and $\mathcal{B}$. We first compute the lower and upper envelopes of $\mathcal{L}$, which are the
bucket lines $\beta_0$ and $\beta_b$, respectively. We can then compute the rest of the bucket lines
$\beta_1, \ldots, \beta_{b-1}$ in another $O(bn)$ time. We preprocess each $\beta_i$ for answering
ray-shooting queries in $O(n \log n)$ time so that a query can be answered in $O(\log n)$
time [^. 

For any line $\ell \in \mathcal{L}$, we can compute all $K_\ell$ intersection points of $\ell$ with $\mathcal{B}$
in $O((K_\ell + 1) \log n)$ time, sorted by their $x$-coordinates, using the ray-shooting
data structure as follows. We traverse $\ell$ from left to right, stopping at each
intersection point of $\ell$ with $\mathcal{B}$. Suppose we are at an intersection point $v$ of $\ell$ and
$\beta_i$, and $\ell$ lies above $\beta_i$ immediately after $v$. Then the next intersection point $w$
of $\ell$ and $\mathcal{B}$, if it exists, lies on either $\beta_i$ or $\beta_{i+1}$. By querying the ray-shooting
data structures for $\beta_i$ and $\beta_{i+1}$ with the ray emanating from $v$ along $\ell$, we can
compute $w$ in $O(\log n)$ time, if it exists. We can thus compute all intersection
points of $\mathcal{L}$ and $\mathcal{B}$ and sort them by their $x$-coordinates in $O((n + K) \log n)$ time.
After having computed all the event points, we can compute an optimal cut in
another $O(n + K)$ time by sweeping from left to right, as in [1]. The space
required by the algorithm is $O(bn + K)$. The space can be reduced to $O(n + b)$,
without affecting the asymptotic running time, by not computing all the event
points in advance; see [1] and the full version for details.

Theorem 2. An optimum partitioning in the tight case can be determined in
$O((bn + K) \log n)$ time using $O(n)$ storage, where $K$ is the number of event
points.



3.3 A Monte-Carlo Algorithm 

We now present an algorithm that attains sub-quadratic running time for small
values of $b$ and $\Delta$, where $n_0 = (n/b) + \Delta$. The overall idea is quite
straightforward. From the given set $\mathcal{L}$ of $n$ lines, we choose a random subset $R$ of size
$r \ge \log n$ (a value that we will determine during the analysis). We compute
$r_0 = \min_\theta \Phi(R, \theta)$, where the minimum is taken over the $x$-coordinates of all the
intersection points of $R$ and $\mathcal{B}$, the set of bucket lines with respect to $\mathcal{L}$. Note
that we are not computing $\Phi(R)$ since we are considering bucket lines with
respect to $\mathcal{L}$. $\mathcal{B}$ can be computed in $O(n \log n + bn)$ time and $r_0$ can be computed in
an additional $O(r(b + n)) = O(rn)$ time. We use $r_0$ to estimate the overall optimum
$n_0$ with high likelihood. In the next phase, we will use this estimate and the
ideas used in the one-dimensional algorithm to sweep only those regions of $\mathcal{B}$
that “potentially” contain the optimum. In our analysis, we will show that the
number of such event points will be $o(n^2)$ if $b$ and $\Delta$ are small. The reader can
also view this approach as being similar to the randomized selection algorithm
of Floyd and Rivest.

We choose two parameters $r$ and $\nu = \nu(r)$ whose values will be specified
in the analysis below. An event point with respect to $\mathcal{L}$ (resp. $R$) is a vertex of
$\mathcal{B}$ or an intersection point of a line of $\mathcal{L}$ (resp. $R$) with a chain in $\mathcal{B}$. The event
points with respect to $R$ partition the chains of $\mathcal{B}$ into disjoint segments, which
we refer to as canonical intervals. Before describing the algorithm we state a few
lemmas, which will be crucial for our algorithm.

Our first lemma establishes a relation between the event points of $\mathcal{A}(\mathcal{L})$ and
those of $\mathcal{A}(R)$.

Lemma 6. Let $a > 0$ be a constant and let $1 \le i \le b$ be an integer. With
probability at least $1 - 1/n^a$, at most $O(n \log n / r)$ event points of $\mathcal{A}(\mathcal{L})$ lie on
any canonical interval of $\beta_i$.

In the following, we will assume that $R$ is a random subset of $\mathcal{L}$ of size
$r \ge \log n$. Using Chernoff's bound, we establish a connection between the lines
of $\mathcal{L}$ and of $R$ intersecting a vertical segment.

Lemma 7. Let $e$ be a vertical segment and let $\mathcal{L}_e \subseteq \mathcal{L}$ be the subset of lines that
intersect $e$. There is a constant $c$ such that with probability exceeding $1 - 1/n^2$,

$$\left| \frac{|\mathcal{L}_e|}{n} - \frac{|\mathcal{L}_e \cap R|}{r} \right| \le c \sqrt{\frac{|\mathcal{L}_e|}{n} \cdot \frac{\log n}{r}}.$$


An immediate corollary of the above lemma is the following. 

Corollary 2. There is a constant $c$ so that, with probability exceeding $1 - 1/n$,

$$\left| \frac{n_0}{n} - \frac{r_0}{r} \right| \le c \sqrt{\frac{\log n}{r}}.$$
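The flavor of these sampling bounds can be seen in a small experiment: the fraction of lines crossing a fixed vertical segment is nearly the same in a random sample as in the full set. The sketch below is only an empirical illustration with hypothetical names and an arbitrary constant $c = 2$; it does not reproduce the proof.

```python
import math
import random


def fraction_hit(lines, x, y_lo, y_hi):
    """Fraction of lines y = s*x + c that cross the vertical segment {x} x [y_lo, y_hi]."""
    hits = sum(1 for (s, c) in lines if y_lo <= s * x + c <= y_hi)
    return hits / len(lines)


def sampling_gap(lines, r, x, y_lo, y_hi, rng):
    """Absolute difference between the hit fraction of all lines and of a random r-subset."""
    sample = rng.sample(lines, r)
    return abs(fraction_hit(lines, x, y_lo, y_hi) - fraction_hit(sample, x, y_lo, y_hi))


if __name__ == "__main__":
    rng = random.Random(0)
    lines = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2000)]
    gap = sampling_gap(lines, 400, 0.5, -0.25, 0.25, rng)
    bound = 2 * math.sqrt(math.log(len(lines)) / 400)
    print(gap, "<=", bound)
```

With $n = 2000$ and $r = 400$, the observed gap is typically far below $c\sqrt{(\log n)/r}$, which is what allows $r_0$ to serve as an estimate of $n_0$.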



Corollary 3. Let $\xi$ be a $\theta$-cut so that every bucket of $\xi$ contains at most $m$
points of $S$. For $1 \le i \le b - 1$, let

$$l_i = r - (b - i)m\frac{r}{n} - c\sqrt{r \log n} \quad\text{and}\quad r_i = im\frac{r}{n} + c\sqrt{r \log n},$$

where $c$ is a constant. Then with probability $> 1 - 1/n$, $l_i \le \lambda(\beta_i(\theta), R) \le r_i$.

Proof. If each bucket of $\xi$ contains at most $m$ points, then the first $i$ buckets of $\xi$ contain
at most $mi$ points of $S$ and the last $(b - i)$ buckets of $\xi$ contain at most $(b - i)m$
points of $S$. The corollary now follows from Lemma 7. □

We also need the following result by Matousek on simplex range searching. 

Lemma 8 (Matousek [01 )• Given a set P of points and a parameter m, n < 
m < , one can preprocess P for triangle range searching in time O(mlogn), to 

build a data- structure of 0(m) space and then report queries in 0((nlog^ n)ly/m-\- 
K) time, for output size K , where K is number of points in the query triangle. 

Remark. If m = O(r^logn) and K > (n/r)logn, then the output size domi- 
nates the query time, so the query time becomes 0{K) in this case. 

We now describe the algorithm in detail. Choose a random sample $R$ of
size $r$, where $r \ge \log n$ is a parameter to be fixed later, and compute $r_0 =
\min_\theta \Phi(R, \theta)$, where $\theta$ varies over the $x$-coordinates of all the event points of
$\mathcal{B}$ with respect to $R$. We can use a quadratic algorithm to compute $r_0$. By
Corollary 2, $n_0 \le m = nr_0/r + cn\sqrt{(\log n)/r}$. Suppose $n_0 = (n/b) + \Delta$ and
$m = (n/b) + \delta$. Since $n_0 \ge nr_0/r - cn\sqrt{(\log n)/r}$, we have $\delta \le \Delta + 2cn\sqrt{(\log n)/r}$
and $m \le (n/b) + \Delta + 2cn\sqrt{(\log n)/r}$.

We will describe an algorithm that searches over all 6*-cuts for which ^(S', 9) < 
m and computes the value of Ug. For each 1 < i < b, let Xi = {9 \ k < 
\{Pi{9),R) < Vi}. We will describe below how to compute Xi. Let X = fl^ZlXi. 
Next, for each 0 < i < b, we compute the set li of canonical intervals of Pi 
whose x-projections intersect X. Let I = and set n = \X\. We will 

describe below how to compute X. Let Ei denote the set of x-coordinates of 
event points (with respect to $\mathcal{L}$) lying on $I_i$. Set $E = \bigcup_i E_i$. By Lemma 6,
$|E| = O((\nu n/r) \log r)$.

By Corollary [3l if ^ is a 0-cut so that all buckets in ^ have at most m points 
of S, then 9 G X. Since the contents of buckets change only at event points, 
<P{C) = minggs 6). We thus have to compute 0) for all 9 G E. 

We preprocess S in 0(r^ log^ n) time into a data structure of size 0(r^ logn) 
for answering triangle range queries using Lemma |8] For each canonical interval 
I Gli, we compute the subset £/ C £ of lines that intersect I in 0{{n/r) logr) 
time. We then compute the intersection points of / and Ci — these are the event 
points with respect to C that lie on I. We repeat this step for all intervals in 
X. The total time spent in computing these intersection points is 0{r^ log^ n + 
v(njr) log r). We now have all points in E at our disposal. For 1 < i < b, 0), 
for 9 G E, changes only if 0 £ ifi-i U Ei, i.e., when a line of £ intersects the 
bucket line /3i_i or (3i. We compute ^i{C,9), for 0 G E^_i UEi, as follows. If 0 is 
the x-coordinate of the left endpoint of a canonical interval in Ii_i UXi, we use 
the range-searching data structure to compute in 0{{n/r) log r) time the number 
kg of lines in £ that intersect the segment Si(0); Hi{C,9) = kg. Otherwise, if 0 
lies on a canonical interval I and if 9' is the x-coordinate of the event point 
lying before 0 on /, then we compute fj.i{C,0) from fii{C,9') in 0(1) time. The 
total time spent in computing /x£s is 0((|li_i| -I- \Ii\){n/r) logr). Summing over 
1 < i < 6, we spent a total of 0{(yn/r) logr) time. By our construction, we now 
have ^i{£,9) all 9 G E. By scanning all these /r£s once more, we can compute 
mingg^; <?(£, 0) in another 0(;/(n/r) logr) time. 

It thus suffices to describe how to compute Xi. Set k = r — {b — i)mrjn — 
C\/ r log n and r, = imr /n + C\/r log n. Define 

a = Ti — li = bm r + 2c\/r logn < bA — h 4c\/ r logn. 

n n 

Since li < \{Pi{9),R) < r^ for all 0 £ Xi, we compute the levels Aj{R), k < j < 
ri. Let Mi be the resulting planar subdivision induced by these levels. By a result 
of Dey [1], \Mi\ = 0(r'^/^(ri — Z^)^/^) = M can be computed in time 

0(|M| log^’*’^ n). Since (3i is an x-monotone polygonal chain and M consists of 
a edge-disjoint x-monotone polygonal chains, the number of intersection points 
between f3i and Mi is 0{na + \M\), and they can be computed within that time 
bound. We can thus compute the set I' of all canonical intervals of (3i whose 
x-projections intersect Xi in time 0{na + r). Let Xi be the x- 

projection of the portion of (3i that lies between Ai.{R) and Ar^{R). Xi consists 
of 0(n -I- r^/^) intervals. We set X = Xi, |X| = 0{h{n + Next 

we discard those canonical intervals of X[ whose x-projections do not intersect 
X. The remaining intervals of X[ gives the set Xi. Therefore v = \X[\ = 

0{b{na + r"*’/^cr^/^)). Repeating this procedure for 0 < i < 6, the total time in 
computing X is 0{b{na + log^’’"'^ r)). The total time in computing Uo is 

thus 

0{n\ogn + bn) + 0(rn) + 0{r^ logr) -|- 0{b{na + ■ (n/r) logr. 

Setting r = [ {bn)"^/^ log n] and using the fact that 6 < n, we obtain the following. 

Theorem 3. There is a Monte Carlo algorithm to compute the optimal uniform
projection of a set of $n$ points in $\mathbb{R}^2$ onto $b$ equal-sized buckets in time
0(min{6^/^n^/^logn + (6^Z\)nlogn, n^}), using 0{(bn)n^^^ log^ n) space, with 
probability at least 1 — 1/n, where the optimal value is (n/b) + A. In particular, 
our algorithm can detect and report if there is a uniform projection (i.e., with 
A = 0) in logn) time. 

4 Two-Dimensional Partitioning 

Using the algorithm developed for the uniform projection, we can obtain a two- 
dimensional partitioning of a point set in the plane similar to the one-dimensional 
partitioning. More specifically, we overlay the plane with two orthogonal families 
of Vb -\- 1 equally spaced parallel lines. These lines generate b rectangles. Our 
previous strategy can be used to obtain an efficient algorithm for the problem of 
obtaining a partitioning that minimizes the maximum bucket content with the 
restriction that the buckets lie between the extreme points. 

As noted by Asano and Tokuyama, the dual plane parameterization gives 
a direct solution to the problem by sweeping the arrangement by two vertical 
lines that are a fixed distance apart (this corresponds to the fixed angle in the 
primal plane) . Using our approach, we obtain a result analogous to Theorem E] 
but has a time-bound that is a multiplicative b factor more. This is because for a 
candidate-interval corresponding to one sweep line, there could be b rectangles. 
For optima bounded by n/6^ -|- A, the candidate intervals correspond to the 
bucket lines lying within b{Ar /n-\- 0{y/r log n)) levels of A{R). Omitting all the 
details, we conclude the following. 

Theorem 4. Given a set of n points in the plane and an integer b, there exists 
a Monte-Carlo algorithm to find an optimal two-dimensional partition in time 
0(min{6®/^n^/^logn-|-(&^Z\)nlogn, n^}), with probability at least 1 — 1/n, where 
the optimal value is n/b'^ -\- A. 

Abstract. This paper considers reconfigurations of polygons, where
each polygon edge is a rigid link, no two of which can cross during the
motion. We prove that one can reconfigure any monotone polygon into
a convex polygon; a polygon is monotone if any vertical line intersects
the interior at a (possibly empty) interval. Our algorithm computes in
$O(n^2)$ time a sequence of $O(n^2)$ moves, each of which rotates just four
joints at once.



1 Introduction 

An interesting area in computational geometry is the reconfiguration of (planar) 
linkages: collections of line segments in the plane (called links) joined at their 
ends to form a particular graph. A reconfiguration is a continuous motion of 
the linkage, or equivalently a continuous motion of the joints, that preserves the 
length of each link. We further enforce that links do not cross, that is, do not 
intersect during the motion. For a survey of work on linkages where crossing is 
allowed, see the paper by Whitesides [10].

The case of noncrossing links has had a recent surge of interest. The most 
fundamental question is still open: Can every chain be reconfigured into any 
other chain with the same sequence of link lengths? Here a chain is a linkage 
whose underlying graph is a path. Because reconfigurations are reversible, an 
equivalent formulation of the question is this: Can every chain be straightened, 
that is, reconfigured so that the angle between any two successive links is $\pi$? This
question has been posed independently by several researchers, including Joseph
Mitchell, and William Lenhart and Sue Whitesides [6]. It has several applications,
including hydraulic tube and wire bending, and sheet metal folding [7].

At first glance, it seems intuitive that any chain can be “unraveled” into 
a straight line, but experimentation reveals that this is a nontrivial problem. 

* Research performed during a post-doctoral position at McGill University. 
Indeed, no such “general unraveling” motions have been formally specified.
Because the problem is so elusive, it is natural to look at special classes of linkages
and prove that at least they can be straightened. For example, consider the class 
of monotone chains, where every vertical line intersects the chain at a point or 
not at all. Such a chain can easily be straightened by repeatedly (1) rotating the 
first link until it lines up with the second link, and (2) fusing these links together 
into a single “first link.” This motion induces no crossings because it preserves 
monotonicity throughout. 

This paper addresses the analogous question of straightenability for polygons
(a linkage whose underlying graph is a cycle): Can every polygon be reconfigured
into a convex polygon? In other words, can every polygon be convexified? This
question was also raised by the researchers mentioned above. Note that a convex
polygon can be reconfigured into any other convex polygon with the same
clockwise sequence of link lengths, and hence the question is equivalent to the
fundamental question for polygons [7]: Can every polygon be reconfigured into
any other polygon with the same clockwise sequence of link lengths?

In this paper we focus on the case of monotone polygons. Similar to the case of
chains, a polygon is monotone if the intersection of every vertical line with the
interior of the polygon is an interval, that is, either a single vertical line segment,
a point, or the empty set. See Fig. 1 for an example. A monotone polygon consists
of two chains, the upper chain and the lower chain. Each chain is weakly monotone
in the sense that the intersection with a vertical line is either empty, a single point,
or a vertical edge.¹ The left [right] ends of the upper and lower chains may be identical
(like point $A$ in Fig. 1), or they may be connected by a vertical edge (like edge
$(B, C)$ in Fig. 1). The vertical edge $(B, C)$ belongs to neither chain.
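Monotonicity of a given polygon is easy to test. The sketch below uses a standard criterion that is not taken from this paper: for a simple polygon, walking once around the boundary and ignoring vertical edges, the horizontal direction of travel reverses at most twice exactly when the polygon is monotone with respect to vertical lines. The function name and the vertex-list representation are choices made here.

```python
def is_monotone_polygon(vertices):
    """Test whether a simple polygon, given as a cyclic list of (x, y) vertices, is monotone.

    Vertical edges (dx == 0) are skipped, which matches the weakly monotone chains
    described in the text. Simplicity of the polygon is assumed, not checked.
    """
    n = len(vertices)
    signs = []
    for i in range(n):
        dx = vertices[(i + 1) % n][0] - vertices[i][0]
        if dx != 0:
            signs.append(1 if dx > 0 else -1)
    # Count direction reversals cyclically; signs[i - 1] wraps around for i == 0.
    reversals = sum(1 for i in range(len(signs)) if signs[i] != signs[i - 1])
    return reversals <= 2
```

A C-shaped polygon that opens to the right fails the test, since a vertical line through the opening meets the interior in two separate segments.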

In contrast to monotone chains, it is nontrivial to convexify monotone polygons.
In this paper, we show that this is possible by a fairly simple motion
consisting of a sequence of $O(n^2)$ moves. We use just a single type of move,
changing the angles of only four joints, which we show is the fewest possible.
While the proof of correctness is nontrivial, our algorithm for computing the
motion is simple and efficient, taking $O(n^2)$ time.




Fig. 1. A monotone polygon. 



1.1 Related Work 

Let us briefly survey the work on reconfiguring linkages whose links are not 
allowed to cross. 

The most related result, by Bose, Lenhart, and Liotta [3], is that all
monotone-separable polygons can be convexified. A monotone-separable polygon is a
monotone polygon whose upper and lower chains are separated by a line segment
(connecting the common ends of the chains). Their motion involves translating



¹ All straight (angle-$\pi$) vertices are removed, so there is no possibility of two adjacent
vertical edges.
almost all joints in the upper chain at once, and appears not to extend to general
monotone polygons. 

The only other result about convexifying classes of polygons is that every 
star-shaped polygon can be convexified [H|. A polygon is star-shaped if its bo- 
undary is entirely visible from a single point. This motion rotates all joints 
simultaneously, and it seems difficult to find a motion involving few joints [10].

The remaining related results are for types of linkages other than polygons. 
For tree linkages, it is known that the answer to the fundamental problem is 
“no” [2]: There are some trees which cannot be reconfigured into other trees with
the same link lengths and planar embedding. Indeed, there can be an exponential 
number of trees with the same link lengths and planar embedding that are 
pairwise unreachable. The complexity of determining whether a tree can be 
reconfigured into another remains open. 

Another way to change the problem is to allow linkages in higher dimensions. 
If we start with a polygon in the plane, and allow motions in three dimensions, 
then every polygon can be convexified jS]. Indeed, a 1935 problem by Erdos 
asks whether a particular sequence of moves through 3D, each rotating only two 
joints (called a “flip”), converges in finite time. While the answer is positive, the 
number of moves is unfortunately unbounded in $n$. Recently, it was shown that
$O(n)$ moves of a different kind suffice [1]; each rotates at most four joints.

If the polygon lies in three dimensions and we want to convexify it by
motion through three dimensions, then it is surely not convexifiable if it is knotted.
But there are unknotted polygons that cannot be convexified [1]. The
complexity of determining whether a polygon in 3D can be convexified also remains
open. Amazingly, Cocan and O’Rourke [4] have shown that every polygon in $d$
dimensions can be convexified through $d$ dimensions for any $d \ge 4$.



1.2 Outline 

The rest of this paper is organized as follows. Section 2 begins with a more formal
description of the problem. Section 3 describes our algorithm for computing
the motion. Sections 4 and 5 prove its correctness and bound its performance,
respectively. We conclude in Section 6.



2 Definitions 

This section gives more formal definitions of the concepts considered in this 
paper: linkages, configurations, and motions. 

Consider a graph, each edge labeled with a positive number. Such a graph 
may be thought of as a collection of distance constraints between pairs of points 
in a Euclidean space. A realization of such a graph maps each vertex to a point, 
also called a joint, and maps each edge to the closed line segment, called a 
link, connecting its incident joints. The link length must equal the label of the 
underlying graph edge. If a graph has one or more such realizations, we call it a 
linkage.

An embedding of a linkage in space is called a configuration of the linkage.
In a simple configuration, any pair of links intersect only at a common endpoint, 
and in this case the links must be incident at this joint in the linkage. We consider 
only simple configurations in this paper. A motion of a linkage is a continuous 
movement of its joints respecting the link lengths such that the configuration of 
the linkage remains simple at all times. 

In this paper, we consider linkages embedded in the plane whose graph is
a single cycle. The configurations are simple polygons, and so divide the plane
into the exterior and interior regions (distinguished by the fact that the interior
region is bounded). Joints of an $n$-link linkage are labeled $j_0, j_1, \ldots, j_{n-1}$ in a
counterclockwise manner: traversing the boundary in sequence $j_0, j_1, \ldots, j_{n-1}$
keeps the interior region on the left. The joint angle $\theta_i$ is the interior angle at
joint $j_i$: $\theta_i = \angle j_{i+1} j_i j_{i-1} \in (0, 2\pi)$. We call a joint straight if the joint angle
is $\pi$, convex if the joint angle is strictly less than $\pi$, and reflex otherwise. A
configuration is convex if none of its joints are reflex.
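
The joint-angle definition can be made concrete with a short sketch. It assumes joints are given as (x, y) tuples in counterclockwise order; the function names `interior_angles` and `classify` and the tolerance `eps` are illustrative choices, not part of the paper.

```python
import math

def interior_angles(joints):
    """Interior angle theta_i at each joint of a simple polygon listed counterclockwise."""
    n = len(joints)
    angles = []
    for i in range(n):
        px, py = joints[i - 1]          # j_{i-1}
        cx, cy = joints[i]              # j_i
        nx, ny = joints[(i + 1) % n]    # j_{i+1}
        # Sweep counterclockwise from the ray j_i -> j_{i+1} to the ray j_i -> j_{i-1}.
        a_next = math.atan2(ny - cy, nx - cx)
        a_prev = math.atan2(py - cy, px - cx)
        angles.append((a_prev - a_next) % (2 * math.pi))
    return angles

def classify(theta, eps=1e-9):
    """Label a joint angle as straight, convex, or reflex (eps is an assumed tolerance)."""
    if abs(theta - math.pi) <= eps:
        return "straight"
    return "convex" if theta < math.pi else "reflex"
```

For example, the polygon (0,0), (4,0), (4,4), (2,1), (0,4) has exactly one reflex joint, at (2,1), so it is not a convex configuration.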

The question considered is whether every monotone configuration of a cyclic 
linkage (or polygon) can be convexified, that is, moved to a convex configuration. 
We show this is true by giving an algorithm to compute such a motion. 



3 Algorithm 

As input, the algorithm requires a description (that is, the joint coordinates) of 
a polygon with n links. In each main step, the algorithm computes a sequence of 
$O(n)$ moves that ultimately straighten a joint. This joint angle is then held fixed
so that it remains straight forever after, effectively reducing the number of joints. 
As the algorithm continues, the configuration has fewer and fewer nonstraight 
joints. We stop when no reflex joint is left, and so the polygon is convex as 
desired. 

First we need some notation for basic geometric concepts. Let $p$ and $q$ be two
points in the plane. Define $\|pq\|$ to be the Euclidean distance between $p$ and $q$.
If $p$ and $q$ are distinct, define ray$(p, q)$ to be the ray originating at $p$ and passing
through point $q$. We use “left” and “right” in two different senses, one for points
and one for rays. A point $r$ is left or right of ray$(p, q)$ if it is strictly left or right
(respectively) of the oriented line supporting ray$(p, q)$. A point $p$ is left or right
of point $q$ if $p$ has a strictly smaller or larger $x$ coordinate than $q$, respectively.
In both cases, we use nonstrictly left/right to denote left/right or equality (that is,
neither strictly left nor strictly right).
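
As a minimal sketch of these two senses of left and right, the following uses the sign of a cross product for rays and an $x$-coordinate comparison for points. The function names are hypothetical, and points are assumed to be (x, y) tuples.

```python
def side_of_ray(p, q, r):
    """+1 if r is strictly left of ray(p, q), -1 if strictly right, 0 if on the supporting line."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (cross > 0) - (cross < 0)

def is_left_of_point(p, q):
    """True if point p has a strictly smaller x coordinate than point q."""
    return p[0] < q[0]

def is_nonstrictly_left_of_point(p, q):
    """True if p is left of q or has the same x coordinate."""
    return p[0] <= q[0]
```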

The algorithm works as follows: 

Algorithm Convexify

— Until the polygon is convex:

  — Find a rightmost reflex joint (a joint with maximum $x$ coordinate,
    breaking ties arbitrarily).

  — Relabel joints counterclockwise along the polygon so that this rightmost
    reflex joint is $j_1$, and all straight joints are ignored.

  — If $j_1$ belongs to the lower chain:

    1. Compute the largest index $p$ such that $j_2, \ldots, j_{p-1}$ are right of
       ray$(j_0, j_1)$.

    2. Until a joint has straightened:

       a) Perform the following move (see Fig. 2) until either the joints $j_0$,
          $j_1$, and $j_{p-1}$ become collinear; or one of the joints
          $\{j_0, j_1, j_{p-1}, j_p\}$ straightens:

          i. Fix the positions of joints $j_p, j_{p+1}, \ldots, j_{n-1}$, and $j_0$.

          ii. Fix all joint angles except those at $j_0$, $j_1$, $j_{p-1}$, and $j_p$.

          iii. Rotate $j_1$ clockwise about $j_0$.

          iv. Move joint $j_{p-1}$ as uniquely defined by maintaining the
              distances $\|j_1 j_{p-1}\|$ and $\|j_{p-1} j_p\|$.

          v. Move joint $j_i$, $2 \le i \le p-2$, as uniquely defined by
             maintaining the distances $\|j_i j_1\|$ and $\|j_i j_{p-1}\|$.

       b) Update the coordinates of $j_1$ and $j_{p-1}$.

       c) If $j_0$, $j_1$, and $j_{p-1}$ are collinear, then decrement $p$, because with
          the new positions, $j_{p-1}$ is on ray$(j_0, j_1)$. Also update the
          coordinates of the new $j_{p-1}$.

    3. Update the coordinates of any remaining joints that have moved.

  — If $j_1$ belongs to the upper chain, the algorithm is similar.

Fig. 2. … joints $j_1$ and $j_{p-1}$. The thick dashed lines represent “virtual
links” whose lengths are preserved.
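
The bookkeeping at the start of each main step (choosing the rightmost reflex joint, relabeling, and computing $p$ in Step 1) can be sketched as follows. This is only an illustration under simplifying assumptions: joints are (x, y) tuples in counterclockwise order, straight joints have already been removed, and the chosen joint lies on the lower chain. The helper names are hypothetical, and the move itself (Step 2) is not implemented here.

```python
def cross(p, q, r):
    """Cross product (q - p) x (r - p); negative when r is strictly right of ray(p, q)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def relabel_at_rightmost_reflex(joints):
    """Rotate the joint list so that a rightmost reflex joint gets index 1.

    Returns None if the polygon has no reflex joint (it is already convex).
    """
    n = len(joints)
    # In a counterclockwise polygon, a joint is reflex exactly when the boundary turns right there.
    reflex = [i for i in range(n)
              if cross(joints[i - 1], joints[i], joints[(i + 1) % n]) < 0]
    if not reflex:
        return None
    r = max(reflex, key=lambda i: joints[i][0])   # ties broken arbitrarily
    return [joints[(r - 1 + k) % n] for k in range(n)]

def largest_p(joints):
    """Largest index p such that joints 2..p-1 are strictly right of ray(j_0, j_1)."""
    j0, j1 = joints[0], joints[1]
    p = 2
    while p < len(joints) and cross(j0, j1, joints[p]) < 0:
        p += 1
    return p
```

On the monotone polygon (0,0), (2,2), (4,0), (6,3), (3,5), the only reflex joint is (2,2); joints (4,0) and (6,3) lie right of the ray from (0,0) through (2,2), so $p = 4$.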



First let us justify that the algorithm is well-defined. 



Lemma 1. The definition of $p$ in Step 1 is well-defined and at least 3.



Proof. Because $j_1$ is reflex, ray$(j_0, j_1)$ must intersect the polygon elsewhere than
the segment $(j_0, j_1)$. This implies that there are joints on both sides of the ray,
and hence $p$ is well-defined. Furthermore, because $j_1$ is reflex and the joints are
oriented counterclockwise on the polygon, $j_2$ is right of ray$(j_0, j_1)$. Hence, 2 is a
valid value for $p - 1$, so $p \ge 3$. □



Note further that the number of simultaneously rotating joints (four) is the 
best possible, because any motion of a polygon that rotates just three joints 
reconfigures a virtual triangle, which is rigid. 



4 Proof of Correctness 

In this section, we prove the following theorem. 

Theorem 1. Given any monotone polygon, Algorithm Convexify computes a 
convexifying motion, during which the polygon remains simple and monotone. 

The difficulty is in showing that the polygon remains monotone and simple 
during the motion. For the remainder of this section, assume without loss of 
generality that the link $(j_0, j_1)$ is on the lower chain. First we need some trivial
but important observations.

Lemma 2. During each move, no reflex angle becomes convex and no convex
angle becomes reflex.

Proof. Because all other joint angles are fixed, any such transition means that
a joint $j_0$, $j_1$, $j_{p-1}$, or $j_p$ straightens, which stops the move. □

Lemma 3. Joints $j_2, \ldots, j_{p-1}$ are convex.

Proof. Consider such a joint $j_i$. Because $j_1$ is a rightmost reflex vertex, $j_i$ is
convex if it is right of $j_1$. By monotonicity and the property that $j_i$ is right of
ray$(j_0, j_1)$, $j_i$ cannot be left of $j_1$. The only case that remains is when $j_i$ has the
same $x$ coordinate as $j_1$. Because we ignore straight joints, $j_i$ must in fact be $j_2$
in this case. Because $j_1$ is reflex, $j_2$ must be below $j_1$. Now $j_3$ must be strictly
right of $j_2$, and hence the angle at $j_2$ is convex. □

Next we need a general result about quadrangles. 

Lemma 4. [1] Consider a simple quadrangle $j_0, j_1, j_2, j_3$ (in counterclockwise
order) with $j_1$ reflex. (See Fig. 3.) Let $\theta_i$ denote the interior angle at joint
$j_i$. If the linkage moves so that $j_1$ rotates clockwise about $j_0$, then $\theta_1$
decreases and $\theta_i$ increases for all $i \in \{0, 2, 3\}$, until $\theta_1$ straightens.
In other words, all of the angles approach $\pi$.

By the definition of $p$, $j_1$ is reflex in the quadrangle $Q = (j_0, j_1, j_{p-1}, j_p)$, so
we can apply Lemma 4 to $Q$ and obtain the following result about our motion:

Lemma 5. During each move in Step 2a, $j_{p-1}$ rotates counterclockwise about
$j_p$, and the joint angles $\theta_1$ and $\theta_{p-1}$ both approach $\pi$.

Next we analyze the movement of $j_1$ relative to $j_{p-1}$’s reference frame,
determined by fixing the position of $j_{p-1}$ and keeping the axes parallel to the world
frame’s. This can be visualized by imagining that during the motion we translate
the entire linkage so that $j_{p-1}$ stays in its original position.

Lemma 6. $j_1$ rotates counterclockwise about $j_{p-1}$.

Proof. Consider the relative movement of $j_1$ and $j_p$ about $j_{p-1}$. Joint $j_p$ is
rotating counterclockwise about $j_{p-1}$ and the angle $\angle j_1 j_{p-1} j_p$ is increasing by
Lemma 5. Hence, $j_1$ must also be rotating counterclockwise about $j_{p-1}$. □

We are now in the position to prove that the polygon remains simple and 
monotone throughout the motion. 

Proof (Theorem 1). The only way that simplicity or monotonicity can be
violated is that either a link intersects another link, or a vertical link rotates in the
“wrong” direction. The wrong direction for link $(j_i, j_{i+1})$ on the lower [upper]
chain is when $j_{i+1}$ becomes left [right] of $j_i$. Consider the first time at which a
link intersects another, or a vertical link rotates in the wrong direction.

Suppose first that a vertical link $(j_i, j_{i+1})$ rotates in the wrong direction.
Because only joints $j_1, \ldots, j_{p-1}$ move, we must have $0 \le i \le p-1$. We distinguish
three cases:




Case 1: Link $(j_0, j_1)$ is vertical

Because $j_1$ is reflex and $j_2$ is nonstrictly right of $j_1$, $j_1$ must be above $j_0$.
But $j_1$ rotates clockwise about $j_0$, so monotonicity is preserved.

Case 2: Link $(j_1, j_2)$ is vertical

Because $j_1$ is reflex, $j_1$ is above $j_2$. By Lemma 6, $j_1$ rotates counterclockwise
about $j_{p-1}$. Hence, the rigid triangle $j_{p-1} j_1 j_2$ rotates counterclockwise about
$j_{p-1}$. Thus because the link $(j_1, j_2)$ is vertical, $j_1$ is above $j_2$, and $j_{p-1}$ is
(nonstrictly) right of $j_1$, $j_1$ moves left of $j_2$. Thus, monotonicity is preserved.

Case 3: Link $(j_i, j_{i+1})$ is vertical, $2 \le i \le p-1$

We show that joints $j_i$ and $j_{i+1}$ are both convex. First, if $i \le p-2$, then this
follows by Lemma 3. Second, if $i = p-1$, then $j_{p-1}$ and $j_p$ must be right of
$j_1$ in order for them to be on opposite sides of ray$(j_0, j_1)$. Hence, $j_{p-1}$ and
$j_p$ must be convex because they are right of $j_1$ which is a rightmost reflex
vertex.

Thus, joints $j_{i-1}$ and $j_{i+2}$ are both left of the link $(j_i, j_{i+1})$; that is, $(j_{i-1}, j_i)$
belongs to the lower chain and $(j_{i+1}, j_{i+2})$ belongs to the upper chain. This
means that link $(j_i, j_{i+1})$ joins the top chain to the bottom chain (like link
HC in Fig. 1). The polygon remains monotone no matter which way the link
moves.

Now suppose that two links intersect each other, but the polygon remained
simple and monotone before this time. By Lemma 5 the joint angles $\theta_1$ and
$\theta_{p-1}$ approach $\pi$, and hence the chain of moving links $(j_0, \ldots, j_p)$ cannot
self-intersect. Hence, the only concern is whether any of these links could intersect
the rest of the polygon.

In the following, refer to Fig. 4. Let $u$ denote the original position of ray$(j_0, j_1)$,
and $v$ denote the downward vertical ray emanating from $j_0$. Let $W$ be the wedge
right of ray $u$ and left of ray $v$.

Joint $j_1$ starts on the boundary of $W$; because it rotates clockwise about $j_0$, it
enters region $W$ at the start of the move. By the choice of $p$, joints $j_2, \ldots, j_{p-1}$
all lie to the right of $u$. These joints must also lie to the left of $v$, because
otherwise the polygon would not be monotone. Thus, after the move starts, the
chain $j_1, \ldots, j_{p-1}$ lies inside $W$.

We now argue that the only joints that can be inside $W$ are $j_1, \ldots, j_{p-1}$.
We know that $j_0$ and $j_p$ are not interior to $W$. If some other joint lies inside
$W$, the chain must cross one of the boundaries of $W$. The chain cannot cross
ray $v$, because that would violate monotonicity. Nor can the chain cross ray $u$,
because that would require a reflex vertex to the right of $j_1$ or a violation of
monotonicity. Because none of $j_p, \ldots, j_{n-1}, j_0$ are moving, this chain remains
outside $W$ during the motion.

To establish that the polygon remains simple, we claim further that the joints
$j_1, \ldots, j_{p-1}$ never leave $W$. Suppose to the contrary that one does. Let $j_i$ be the
first such joint to leave $W$. It must reach either ray $u$ or ray $v$. Consider each
possibility in turn.

$j_i$ crosses $v$: Because $j_1$ stays reflex (unless it straightens which stops the move),
it cannot be the first to cross $v$. If $j_i$ crosses $v$ for $1 < i < p$, then we have
both $j_0$ and $j_i$ left of $j_1$, so the chain is not monotone, a contradiction.

$j_i$ crosses $u$: Because $j_1$ rotates clockwise about $j_0$, it never crosses $u$. If $j_2$
crosses, $j_1$ cannot be reflex, a contradiction. If $j_{p-1}$ crosses, the move must
have already stopped from $j_0$, $j_1$, and $j_{p-1}$ becoming collinear because $j_1$
rotates clockwise about $j_0$. Finally, if $j_i$ crosses $u$ for $2 < i < p-1$, $j_i$ must
be reflex because $j_{i-1}$ and $j_{i+1}$ are both right of $u$, contradicting Lemma 3.

Hence, the links of the chain $j_0, \ldots, j_{p-1}$ always remain inside $W$, so they cannot
intersect links outside of $W$.

Fig. 4. Definition of $v$, $v'$, $W$, and $P$. (Left) When $(j_p, j_{p+1})$ is not on the lower chain.
(Right) When $(j_p, j_{p+1})$ is on the lower chain.

This leaves just one moving link, $(j_{p-1}, j_p)$. By Lemma 5, $j_{p-1}$ rotates
counterclockwise about $j_p$. Define $v'$ to be the vertical ray emanating from $j_p$ that
points away from the interior of the polygon, preferring upward if both directions
are possible. Thus, $v'$ points downward [upward] when the link $(j_p, j_{p+1})$ is [not]
on the lower chain.

Let $P$ be the pie wedge bounded by the link $(j_{p-1}, j_p)$, the ray $v'$, and the
counterclockwise circular arc, centered at $j_p$, starting at $j_{p-1}$ and ending on $v'$ (at
distance $\|j_p j_{p-1}\|$ from $j_p$). Because monotonicity is preserved up to this point,
$P$ is empty of nonmoving links. $P$ also contains the entire sweep of $(j_{p-1}, j_p)$: if
$v'$ points upward, $j_{p-1}$ cannot cross $v'$ without first becoming collinear with $j_0$
and $j_1$; if $v'$ points downward, it cannot cross without first straightening $j_p$. □

5 Time and Move Bounds 

Finally we establish the time and move bounds on the algorithm. Our model 
of computation is a real random-access machine supporting comparisons, basic 
arithmetic, and square roots. 

Lemma 7. Each iteration of Step 2 takes constant time.

Proof. The computations in this step are computing which of the five
candidate events for stopping the motion occurs first (Step 2a), and updating $O(1)$
coordinate positions (Steps 2b and 2c).

In order to compute the halting event, we define the following (refer to Fig. 5).
Here $x^*$ denotes that $x$ changes during the move.

$$a = \|j_0 j_1\|, \quad b = \|j_1 j_{p-1}\|, \quad c = \|j_{p-1} j_p\|, \quad d = \|j_p j_0\|, \quad m^* = \|j_1 j_p\|,$$
$$\varphi_0^* = \angle j_p j_0 j_1, \quad \varphi_1^* = \angle j_0 j_1 j_{p-1}.$$

Note that $m^*$ increases during the move, because $j_1^*$ rotates clockwise about $j_0$,
increasing $\varphi_0^*$ and thus (by the law of cosines) $m^*$.

Hence, we parameterize the move by the diagonal distance $m^*$. More precisely, we
determine the halting event by computing the value of $(m^*)^2$ for each of the five
candidate events, and choosing the event with smallest $(m^*)^2$.

Now the event that $j_1$ straightens happens when its angle $\theta_1^*$ is $\pi$, and the
event that $j_0$, $j_1$, and $j_{p-1}$ become collinear happens when $\varphi_1^* = \pi$. Note
however that $j_1^*$ can only straighten after $j_0$, $j_1^*$, and $j_{p-1}^*$ become collinear (because
$\angle j_0 j_1 j_{p-1}$ is initially reflex), and hence we do not need to consider this event.
Using the law of cosines, we obtain the following solutions to these event
equations:

$j_0$ straightens: $(m^*)^2 = a^2 + d^2 + 2ad\cos(\angle j_{n-1} j_0 j_p)$ and $\angle j_{n-1} j_0 j_p < \pi$.

$j_{p-1}^*$ straightens: $(m^*)^2 = b^2 + c^2 + 2bc\cos(\angle j_{p-2} j_{p-1} j_1)$.

$j_p$ straightens: Provided $\angle j_0 j_p j_{p+1} < \pi$ (which is necessary for this event),
$(m^*)^2$ is the solution of a quadratic polynomial involving $a$, $b$, $c$, $d$, and
$\cos(\angle j_0 j_p j_{p+1})$. This reduces to an arithmetic expression involving square
roots.

$j_0, j_1^*, j_{p-1}^*$ collinear: $(m^*)^2 = (ac^2 + bd^2 - ab(a+b))/(a+b)$, because
$\cos\alpha_1^* = -\cos\beta_1^*$.
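
The closed-form event values above can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, angles are assumed to be given in radians, and the $j_p$ event is left out because its quadratic is not spelled out in the text.

```python
import math

def m2_j0_straightens(a, d, angle_jn1_j0_jp):
    """(m*)^2 when j_0 straightens, from the law of cosines in triangle j_0 j_1 j_p."""
    return a * a + d * d + 2 * a * d * math.cos(angle_jn1_j0_jp)

def m2_jp1_straightens(b, c, angle_jp2_jp1_j1):
    """(m*)^2 when j_{p-1} straightens, from the law of cosines in triangle j_1 j_{p-1} j_p."""
    return b * b + c * c + 2 * b * c * math.cos(angle_jp2_jp1_j1)

def m2_collinear(a, b, c, d):
    """(m*)^2 when j_0, j_1, j_{p-1} become collinear."""
    return (a * c * c + b * d * d - a * b * (a + b)) / (a + b)

def first_event(candidates):
    """Name of the candidate event with the smallest (m*)^2; candidates maps name -> value."""
    return min(candidates, key=candidates.get)
```

As a sanity check of the collinear case, take $j_0 = (0,0)$, $j_1 = (2,0)$, $j_{p-1} = (5,0)$, and $j_p = (3,4)$: then $a = 2$, $b = 3$, $c = \sqrt{20}$, $d = 5$, and the formula gives $(40 + 75 - 30)/5 = 17$, which equals $\|j_1 j_p\|^2 = 1 + 16$.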

The new joint coordinates can be computed as follows: Suppose that we know
the new coordinates of joints $j_{i-1}$ and $j_i$ (initially $i = 0$). Let $v$ be the vector
from $j_i$ to $j_{i-1}$, rescaled to have the known length $\|j_i j_{i+1}\|$. Rotate this vector
clockwise by the new $\theta_i$, which only involves cosines and sines of this angle, and
then add it to the point $j_i$. The result is the new position of joint $j_{i+1}^*$. We can
similarly compute the new coordinates of $j_{p-1}^*$ from the coordinates of $j_p$ and
$j_{p+1}$. Thus, each update of $j_1^*$, $j_{p-1}^*$ (Step 2b), and possibly $j_{p-2}^*$ (Step 2c) takes
constant time as desired. □
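
The coordinate update described in this proof can be sketched in a few lines. The function name `next_joint` and its argument order are illustrative assumptions; points are (x, y) tuples and the angle is in radians.

```python
import math

def next_joint(prev, cur, length, theta):
    """Position of j_{i+1}, given j_{i-1} (prev), j_i (cur), the link length
    ||j_i j_{i+1}||, and the interior angle theta at j_i (counterclockwise polygon)."""
    # Vector from j_i to j_{i-1}, rescaled to the length of the next link.
    vx, vy = prev[0] - cur[0], prev[1] - cur[1]
    norm = math.hypot(vx, vy)
    vx, vy = vx * length / norm, vy * length / norm
    # Rotate clockwise by theta.
    c, s = math.cos(theta), math.sin(theta)
    rx = vx * c + vy * s
    ry = -vx * s + vy * c
    return (cur[0] + rx, cur[1] + ry)
```

With `prev = (0, 0)`, `cur = (1, 0)`, unit link length, and a right angle, the next joint lands at (1, 1), as expected for a counterclockwise unit square.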

Theorem 2. Algorithm Convexify computes $O(n^2)$ moves in $O(n^2)$ time.

Proof. By Lemma 7, any one move takes $O(1)$ time to compute. Any execution
of Step 2 therefore takes $O(n)$ time, because initially $p \le n-1$, and $p$ decreases
with every move, until a joint straightens and Step 2 terminates. All other steps
can also be implemented in $O(n)$ time. One iteration of the main loop thus
takes $O(n)$ time. Because each iteration straightens a joint, the polygon becomes
convex after $O(n)$ iterations. Hence, there are $O(n^2)$ moves, which are computed
in $O(n^2)$ time. □




Fig. 5. Illustration of the proof of Lemma 7.






T.C. Biedl et al. 



6 Conclusion 

We have presented an $O(n^2)$-time algorithm to compute a sequence of $O(n^2)$
moves, each rotating the minimum possible number of four joints at once, that
reconfigures a given monotone polygon into a convex polygon with the same link
lengths. By running the algorithm twice we can find a motion between any two
monotone polygons with the same clockwise sequence of link lengths.

Several interesting open problems remain. Can our algorithm be improved to
use $o(n^2)$ moves each rotating $o(n)$ joints? More generally, what is the tradeoff
between the number of simultaneously rotated joints and the number of moves?

Our result adds to the class of polygons that are known to be convexifiable;
previously, the only nontrivial classes were star-shaped polygons [5] and
monotone-separable polygons [3]. A natural area of research is to explore more
general classes of polygons. Is there a convexifiable class containing both
monotone and star-shaped polygons (other than trivial classes like the union)?

Abstract. Given two subsets $T_1$ and $T_2$ of vertices in a 3-connected
graph $G = (V,E)$, where $|T_1|$ and $|T_2|$ are even numbers, we show that
$V$ can be partitioned into two sets $V_1$ and $V_2$ such that the graphs induced
by $V_1$ and $V_2$ are both connected and $|V_1 \cap T_j| = |V_2 \cap T_j| = |T_j|/2$ holds
for each $j = 1,2$. Such a partition can be found in $O(|V|^2)$ time. Our
proof relies on geometric arguments. We define a new type of ‘convex
embedding’ of $k$-connected graphs into real space and prove that
for $k = 3$ such an embedding always exists.



1 Introduction 

We define the following graph-partitioning problem: Given an undirected graph
$G = (V, E)$ and $k$ subsets $T_1, \ldots, T_k$ of $V$, not necessarily disjoint, find a partition
$V_1, \ldots, V_\ell$ of $V$ into $\ell$ subsets such that $G[V_i]$ $(1 \le i \le \ell)$ are all connected and
$a_{ij} \le |V_i \cap T_j| \le b_{ij}$ holds for each pair $i, j$, where $a_{ij}$ and $b_{ij}$ are prespecified
lower and upper bounds. In this problem, we interpret each $T_i$ as a set of vertices
which possess ‘resource’ $i$. In particular, one may ask for a partition where the
vertices in each $T_j$ $(1 \le j \le k)$ are distributed among the subsets $V_1, \ldots, V_\ell$ as
equally as possible.

In this paper, we consider the case of $\ell = 2$ and ask to distribute the resources
equally: Given an undirected graph $G = (V,E)$ and $k$ subsets $T_1, T_2, \ldots, T_k$ of
$V$, where $|T_j|$ is even for all $j$, find a bipartition $\{V_1, V_2\}$ of $V$ such that the
graph $G[V_i]$ induced by each $V_i$ is connected and $|V_1 \cap T_j| = |V_2 \cap T_j|$ $(= |T_j|/2)$
holds for $j = 1, \ldots, k$. Let us call this problem the $k$-bisection problem. We prove
that for every 3-connected graph $G$ and for every choice of resources $T_1$ and $T_2$,
the 2-bisection problem has a solution.

To verify this, we reduce it to a geometrical problem. Our method is outlined
as follows. We first prove that every 3-connected graph $G$ can be embedded in the
plane in such a way that the convex hull of its vertices, which is a convex polygon,
corresponds to a cycle $C$ of $G$, and every vertex $v$ in $V - V(C)$ is in the convex
hull of its neighbors (precise definition is given in Section 4). This will guarantee
that, for any given straight line $L$ in the plane, each of the two subgraphs of
$G$ separated by $L$ remains connected. Given such an embedding, we apply the
‘ham-sandwich cut’ algorithm, which is well known in computational geometry,
to find a straight line $L^*$ that bisects the two subsets $T_1$ and $T_2$ simultaneously.
Since the above embedding ensures that the two subgraphs separated by the
ham-sandwich cut $L^*$ are connected, this bipartition of the vertices becomes a
solution to the 2-bisection problem.

* This research was partially supported by the Scientific Grant-in-Aid from Ministry
of Education, Science, Sports and Culture of Japan, and the subsidy from the
Inamori Foundation. Part of this work was done while the second author visited the

We give an algorithm which finds such a bisection in $O(|V|^2)$ time.



1.1 Related Results 

If $\ell = 2$ and there is just one set $T$ of resources, the problem is NP-hard for
general graphs, since it is NP-hard to test whether a given graph $G = (V, E)$
and an integer $n_1 < |V|$ have a partition of $V$ into two subsets $V_1$ and $V_2$ such
that the graph induced by $V_i$ is connected for $i = 1,2$, and $|V_1| = n_1$ holds [2].
When $G$ is 2-connected, it is known that such a partition $\{V_1, V_2\}$ always exists
and it can be found in linear time unsi. More generally, the following result
was shown independently by Győri and Lovász [S].

Theorem 1.1 Let $G = (V, E)$ be an $\ell$-connected graph, $w_1, w_2, \ldots, w_\ell \in V$
be different vertices and $n_1, n_2, \ldots, n_\ell$ be positive integers such that
$n_1 + n_2 + \cdots + n_\ell = |V|$. Then there exists a partition $\{V_1, V_2, \ldots, V_\ell\}$ of $V$ such that $G[V_i]$
is connected, $|V_i| = n_i$ and $w_i \in V_i$ for $i = 1, \ldots, \ell$. □



2 Preliminaries 

Let $G = (V, E)$ stand for an undirected graph with a set $V$ of vertices and a set
$E$ of edges, where we denote $|V|$ by $n$, $|E|$ by $m$. For a subgraph $H$ of $G$, the sets
of vertices and edges in $H$ are denoted by $V(H)$ and $E(H)$, respectively. Let $X$
be a subset of $V$. The subgraph of $G$ induced by $X$ is denoted by $G[X]$. A vertex
$v \in V - X$ is called a neighbor of $X$ if it is adjacent to some vertex $u \in X$, and
the set of all neighbors of $X$ is denoted by $\Gamma_G(X)$. Let $e = (u,v)$ be an edge
with end vertices $u$ and $v$. We denote by $G/e$ the graph obtained from $G$ by
contracting $u$ and $v$ into a single vertex (deleting any resulting self-loop), and by
$G - e$ the graph obtained from $G$ by removing $e$. Subdividing an edge $e = (u, v)$
means that we replace $e$ by a path $P$ from $u$ to $v$ where the inner vertices of
$P$ are new vertices of the graph. If we obtain a graph $G'$ by subdividing some
edges in $G$, then the resulting graph is called a subdivision of $G$. A graph $G$ is
$k$-connected if and only if $|V| \ge k + 1$ and the graph $G - X$ obtained from $G$ by
removing any set $X$ of $(k - 1)$ vertices remains connected. A singleton set $\{x\}$
may be simply written as $x$.

2.1 The Ham-Sandwich Theorem

Consider the $d$-dimensional space $R^d$. For a non-zero $a \in R^d$ and a real $b \in R^1$,
$H(a, b) = \{x \in R^d \mid \langle a \cdot x \rangle = b\}$ is called a hyperplane, where $\langle a \cdot x \rangle$ denotes the
inner product of $a, x \in R^d$. Moreover, $H^+(a,b) = \{x \in R^d \mid \langle a \cdot x \rangle \ge b\}$ (resp.,
$H^-(a,b) = \{x \in R^d \mid \langle a \cdot x \rangle \le b\}$) is called a positive closed half space (resp.,
negative closed half space) with respect to $H$.

Let $P_1, \ldots, P_d$ be $d$ sets of points in $R^d$. We say that a hyperplane
$H = H(a,b)$ in $R^d$ bisects $P_i$ if $|H^+(a,b) \cap P_i| < \ldots$ and $|H^-(a,b) \cap P_i| < \ldots$
Thus, if $|P_i|$ is odd, then any bisector $H$ of $P_i$ contains at least one point of $P_i$.
If $H$ bisects $P_i$ for all $i = 1, \ldots, d$, then $H$ is called a ham-sandwich cut with
respect to the sets $P_1, \ldots, P_d$. The following theorem is well-known.

Theorem 2.1 Given $d$ sets $P_1, \ldots, P_d$ of points in the $d$-dimensional space
$R^d$, there exists a hyperplane which is a ham-sandwich cut with respect to
$P_1, \ldots, P_d$. □

3 Bisecting k Subsets in Graphs 

Let $G = (V,E)$ be a graph and $T_1, T_2, \ldots, T_k$ be subsets of $V$, where $|T_j|$ is even
for $j = 1, \ldots, k$. A bipartition $\{V_1, V_2\}$ of $V$ is a $k$-bisection if $G[V_i]$, $i = 1,2$, are
connected and $|V_1 \cap T_j| = |V_2 \cap T_j|$ $(= |T_j|/2)$ holds for $j = 1, 2, \ldots, k$.

One can observe that the $k$-connectivity of $G$ is not sufficient for the existence
of a $k$-bisection. For $k = 1$ let $G = (V, E)$ be the complete bipartite graph $K_{2\ell-1,1}$
with $\ell \ge 2$ and let $T_1 = V$. Clearly, there exists no 1-bisection. For $k \ge 2$ consider
the following example. Let $K_{2k-1,k} = (W \cup Z, E)$ be a complete bipartite graph
with vertex sets $W = \{w_1, w_2, \ldots, w_{2k-1}\}$ and $Z = \{z_1, z_2, \ldots, z_k\}$, and let
$T_i = W \cup \{z_i\}$ for $i = 1, \ldots, k$. Note that $|T_i| = 2k$ holds for all $i = 1, \ldots, k$.
Suppose that $\{V_1, V_2\}$ is a $k$-bisection to $K_{2k-1,k}$ and $\{T_1, \ldots, T_k\}$. The set $T_1$ is
bisected by $\{V_1, V_2\}$; $V_1 \cap T_1 = \{w_1, \ldots, w_k\}$ and $V_2 \cap T_1 = \{w_{k+1}, \ldots, w_{2k-1}, z_1\}$
can be assumed without loss of generality. Since $V_1 \cap T_i = \{w_1, \ldots, w_k\}$ also holds
for each $i = 2, \ldots, k$, $V_1$ cannot contain any vertex in $Z$ and hence $V_1$ must be
$\{w_1, \ldots, w_k\}$. However the induced subgraph $G[V_1]$ is not connected. Thus these
graphs admit no $k$-bisection.
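
The counterexample can be checked exhaustively for small $k$. The following sketch enumerates every bipartition, so it is exponential and meant only as a sanity check. The adjacency-dictionary representation, the vertex names, and the function names are assumptions made for illustration.

```python
from itertools import combinations

def is_connected(vertices, adj):
    """True if the subgraph induced by the non-empty vertex set is connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def has_k_bisection(adj, subsets):
    """Brute force over all bipartitions {V1, V2}; adj maps vertex -> set of neighbors."""
    V = sorted(adj)
    for size in range(1, len(V)):
        for chosen in combinations(V, size):
            V1 = set(chosen)
            V2 = set(V) - V1
            if (all(2 * len(V1 & T) == len(T) for T in subsets)
                    and is_connected(V1, adj) and is_connected(V2, adj)):
                return True
    return False

def complete_bipartite_example(k):
    """K_{2k-1,k} with T_i = W united with {z_i}, as in the text."""
    W = [f"w{i}" for i in range(1, 2 * k)]
    Z = [f"z{i}" for i in range(1, k + 1)]
    adj = {w: set(Z) for w in W}
    adj.update({z: set(W) for z in Z})
    subsets = [set(W) | {z} for z in Z]
    return adj, subsets
```

Calling `has_k_bisection(*complete_bipartite_example(2))` searches all bipartitions of $K_{3,2}$ and finds none that qualifies, in line with the argument above.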

On the other hand, we propose the following conjecture. 

Conjecture 3.1 Let $G = (V,E)$ be a $(k+1)$-connected graph and $T_1, T_2, \ldots, T_k$
be pairwise disjoint subsets of $V$, where $|T_j|$ is even for $j = 1, \ldots, k$. Then $G$
has a $k$-bisection.

Conjecture 3.1 is true for $k = 1$ by using the so-called ‘st-numbering’ of
vertices, as observed in [13].

Corollary 3.2 Let $G = (V,E)$ be a 2-connected graph and $T$ be a subset of
$V$ with even $|T|$. Then $G$ has a 1-bisection $\{V_1, V_2\}$, and such $\{V_1, V_2\}$ can be
computed in $O(m)$ time. □

The main result of this paper is an algorithmic proof for Conjecture 3.1 in
the case of $k = 2$ (in a stronger form where the sets $T_1$, $T_2$ of resources need not
be disjoint). Our algorithm finds a 2-bisection in a 3-connected graph in $O(n^2)$
time.

4 Strictly Convex Embeddings 

In this section, we introduce a new way of embedding a graph $G$ in $R^d$. The
existence of such an embedding (along with the ham-sandwich cut theorem) will
play a crucial role in the proof of Conjecture 3.1 for $k = 2$ in Section 5.

For a set $P = \{x_1, \ldots, x_k\}$ of points in $R^d$, a point
$x' = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_k x_k$ with $\sum_{i=1}^{k} \alpha_i = 1$ and
$\alpha_i \ge 0$, $i = 1, \ldots, k$, is called a convex
combination of $P$, and the set of all convex combinations of $P$ is denoted by
$Conv(P)$.

If $P = \{x_1, x_2\}$, then $Conv(P)$ is called a segment (connecting $x_1$ and $x_2$),
denoted by $[x_1, x_2]$. A subset $S \subseteq R^d$ is called a convex set if $[x, x'] \subseteq S$ for
any two points $x, x' \in S$. For a convex set $S \subseteq R^d$, a point $x \in S$ is called an
extreme point if there is no pair of points $x', x'' \in S - x$ such that $x \in [x', x'']$.
For two extreme points $x_1, x_2 \in S$, the segment $[x_1, x_2]$ is called an edge of $S$
if $\alpha x' + (1 - \alpha)x'' = x \in [x_1, x_2]$ for some $0 < \alpha < 1$ implies $x', x'' \in [x_1, x_2]$.
The intersection $S$ of a finite number of closed half spaces is called a convex
polyhedron, and is called a convex polytope if $S$ is non-empty and bounded.

Given a convex polytope $S$ in $R^d$, the point-edge graph $G_S = (V_S, E_S)$ is
defined to be an undirected graph with vertex set $V_S$ corresponding to the extreme
points of $S$ and edge set $E_S$ corresponding to those pairs of extreme points $x, x'$
for which $[x, x']$ is an edge of $S$. For a convex polyhedron $S$, a hyperplane $H(a,b)$
is called a supporting hyperplane of $S$ if $H(a, b) \cap S \ne \emptyset$ and if $S \subseteq H^+(a,b)$
or $S \subseteq H^-(a,b)$. We say that a point $p \in S$ is strictly inside $S$ if there is no
supporting hyperplane $H$ of $S$ containing $p$. If $S$ has a point strictly inside $S$ in
$R^d$, then $S$ is called full-dimensional in $R^d$. We also denote by $Int(Conv(P))$
the set of points strictly inside $Conv(P)$.

Given a graph $G = (V, E)$, an embedding of $G$ in $R^d$ is a mapping $f : V \to R^d$,
where each vertex $v$ is represented by a point $f(v) \in R^d$, and each edge $e = (u, v)$
by a segment $[f(u), f(v)]$ (which may be written by $f(e)$). For two edges $e, e' \in E$,
segments $f(e)$ and $f(e')$ may cross each other. For $\{v_1, v_2, \ldots, v_p\} = Y \subseteq V$, we
denote by $f(Y)$ the set $\{f(v_1), \ldots, f(v_p)\}$ of points. For a set $Y$ of vertices, we
denote $Conv(f(Y))$ by $Conv_f(Y)$.

We define a new kind of ‘convex embedding’ of a graph G in the d-dimensional 
space: 

Definition 4.1 Let $G$ be a graph without isolated vertices and let $G'$ be a
subgraph of $G$. A strictly convex embedding (or SC-embedding, for short) of $G$ with
boundary $G'$ is an embedding $f$ of $G$ into $R^d$ in such a way that

(i) $\bigcup_{e \in E(G')} f(e)$ is the set of edges of a full-dimensional convex polytope $S$ in $R^d$,

(ii) $f(v) \in Int(Conv_f(\Gamma_G(v)))$ holds for all vertices $v \in V - V(G')$,

(iii) $f(v) \ne f(u)$ for all pairs $u, v \in V$.

It can be seen that the above definition implies that the extreme points of
$Conv_f(V)$ are precisely the points in $f(V(G'))$. To satisfy (i), $G'$ must be a
point-edge graph of the convex polytope $S$.
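
Conditions (ii) and (iii) are easy to test for a concrete embedding in the plane. The sketch below does so with a standard monotone-chain convex hull. It does not check condition (i), and the data layout (a dictionary `f` from vertices to (x, y) tuples, an adjacency dictionary `adj`, and a set `boundary` of boundary vertices) is an assumption made for illustration.

```python
def cross(o, a, b):
    """Cross product (a - o) x (b - o); positive when b is strictly left of the line o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Extreme points of a planar point set in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def strictly_inside(p, points):
    """True if p is strictly inside Conv(points); False when the hull is degenerate."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) > 0 for i in range(len(hull)))

def satisfies_ii_and_iii(f, adj, boundary):
    """Check conditions (ii) and (iii) of Definition 4.1 for a planar embedding f."""
    distinct = len(set(f.values())) == len(f)
    interior_ok = all(strictly_inside(f[v], [f[u] for u in adj[v]])
                      for v in adj if v not in boundary)
    return distinct and interior_ok
```

For the complete graph on four vertices with a triangular boundary at (0,0), (4,0), (0,4), placing the fourth vertex at (1,1) passes the check, while placing it at (2,2), on a boundary edge, fails.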

A similar concept of ‘convex embeddings’ of graphs, requiring only (ii) above,
was introduced by Linial et al. in [2| and led to a new characterization of
$k$-connected graphs. Their embedding, however, is not sufficient for our purposes.








Fig. 1. An SC-embedding $f$ of the 3-connected graph $G_1$.



For a 3-connected graph $G_1$ and a cycle $C$ in Figure 1(a), Figure 1(b)
illustrates an SC-embedding $f$ of $G_1$ with boundary $C$ in $R^2$, where $V(C) = \{v_1, v_5, v_7\}$
and $E(C) = \{(v_1, v_5), (v_5, v_7), (v_1, v_7)\}$.

SC-embeddings into $R^d$ have the following important property (the proof is
omitted).

Lemma 4.2 Let $G = (V,E)$ be a graph without isolated vertices and let $f$ be
an SC-embedding of $G$ into $R^d$. Let $f(V_1) = f(V) \cap H^+(a,b)$ hold for some
hyperplane $H = H(a,b)$ and suppose $V_1 \ne \emptyset$. Then $G[V_1]$ is connected. □

5 SC-Embeddings in the Plane 

In this section, we restrict ourselves to the embeddings in the two dimensional
space $R^2$. We prove that every 3-connected graph $G$ admits an SC-embedding
$f$ with its boundary specified by an arbitrary cycle $C$, and consider how to find
such an SC-embedding $f$ of $G$. For this, we use the following characterization of
3-connected graphs, due to Tutte.

Lemma 5.1 Let $G = (V,E)$ be a 3-connected graph. For any edge $e$, either
$G/e$ is 3-connected or $G - e$ is a subdivision of a 3-connected graph. □

For two points $f(v')$ and $f(v'')$ in an embedding $f$ of $G$, we denote by
$L(f(v'), f(v''))$ the half line that is obtained by extending segment $[f(v'), f(v'')]$
in the direction from $f(v')$ to $f(v'')$, and denote by $\hat{L}(f(v'), f(v''))$ the half line
obtained from $L(f(v'), f(v''))$ by removing points in $[f(v'), f(v'')] - f(v'')$.

Let $f$ be an SC-embedding of a graph $G = (V,E)$ with boundary $C$, where
we assume that for each $v \in V$, the order in which the positions $f(u)$ for $u \in \Gamma_G(v)$
appear around $v$ is known. Let $v \in V - V(C)$, and consider another embedding
$f'$ such that $f'(u) = f(u)$ for all $u \in V - v$ but $f'(v) \neq f(v)$, and also

$f'(u) \in \mathrm{Int}(\mathrm{Conv}_{f'}(\Gamma_G(u)))$ for all $u \in V - V(C) - v$. (5.1)

(However, $f'(v) \in \mathrm{Int}(\mathrm{Conv}_{f'}(\Gamma_G(v)))$ may not hold.) Clearly, the possible
position $f'(v)$ depends only on the neighbors $u$ of $v$. For each neighbor $u \in \Gamma_G(v)$,
we define a set $\mathrm{Cone}_f(v,u) \subseteq R^2$ as follows. If $f(u) \in \mathrm{Int}(\mathrm{Conv}_f(\Gamma_G(u) - v))$,
then let $\mathrm{Cone}_f(v,u) = R^2$. Otherwise (if $f(u) \notin \mathrm{Int}(\mathrm{Conv}_f(\Gamma_G(u) - v))$), $f(u)$
becomes an apex (or lies on an edge) of $\mathrm{Conv}_f(\Gamma_G(u) \cup \{u\} - v)$, and there are two
edges $e_1 = (u, w_1)$ and $e_2 = (u, w_2)$ with $w_1, w_2 \in \Gamma_G(u) - v$ such that the acute
angle $\alpha$ formed by segments $f(e_1)$ and $f(e_2)$ is the maximum (see Figure 2).
Let $\mathrm{Cone}_f(v,u)$ be the cone bounded by the two half lines $L(f(w_1), f(u))$ and
$L(f(w_2), f(u))$ (where points in these half lines are not included in $\mathrm{Cone}_f(v,u)$).
We define an open set

$$R^*_f(v) = \bigcap_{u \in \Gamma_G(v)} \mathrm{Cone}_f(v,u).$$

Clearly, this is the set of the possible positions of $f'(v)$ if $f'$ satisfies (5.1).
Observe that $R^*_f(v)$ is a non-empty open set, since $\mathrm{Cone}_f(v,u)$ is an open set
containing $f(v)$ for each $u \in \Gamma_G(v)$.

Note that $\mathrm{Cone}_f(v,u)$ can be obtained in $O(|\Gamma_G(u)|)$ time since
$\mathrm{Int}(\mathrm{Conv}_f(\Gamma_G(u) - v))$ can be computed in $O(|\Gamma_G(u)|)$ time based on the
ordering in which $f(w)$, $w \in \Gamma_G(u)$, appear around $u$. Summarizing the above argument,
we have the next result.
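
As an illustration of the cone construction, the sketch below tests whether a candidate point lies in one open cone and in the intersection of several. It rests on an assumption about the garbled half-line notation: the cone at apex $f(u)$ is taken to be the set of points $f(u) + a(f(u) - f(w_1)) + b(f(u) - f(w_2))$ with $a, b > 0$, that is, the region beyond $f(u)$ as seen from $w_1$ and $w_2$. The degenerate case where $f(u)$ lies on an edge (the three points are collinear) is not handled, and all names are illustrative.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Cone = Optional[Tuple[Point, Point, Point]]  # (apex, w1, w2), or None for the whole plane


def in_open_cone(apex: Point, w1: Point, w2: Point, p: Point) -> bool:
    """True if p = apex + a*(apex - w1) + b*(apex - w2) for some a, b > 0."""
    d1 = (apex[0] - w1[0], apex[1] - w1[1])
    d2 = (apex[0] - w2[0], apex[1] - w2[1])
    det = d1[0] * d2[1] - d1[1] * d2[0]
    if det == 0:
        raise ValueError("w1, apex and w2 are collinear; this sketch skips that case")
    q = (p[0] - apex[0], p[1] - apex[1])
    # Cramer's rule for q = a*d1 + b*d2.
    a = (q[0] * d2[1] - q[1] * d2[0]) / det
    b = (d1[0] * q[1] - d1[1] * q[0]) / det
    return a > 0 and b > 0


def in_r_star(p: Point, cones: List[Cone]) -> bool:
    """Membership in the intersection of the cones; None stands for Cone = R^2."""
    return all(c is None or in_open_cone(c[0], c[1], c[2], p) for c in cones)
```

Boundary points of a cone are rejected, matching the requirement that the half lines themselves are excluded.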




Fig. 2. Definition of $\mathrm{Cone}_f(v,u)$.
Lemma 5.2 Given an SC-embedding $f$ of a graph $G = (V,E)$ with boundary
$C$, and a vertex $v \in V - V(C)$, $R^*_f(v)$ is nonempty, and redefining $f(v)$ to
any point in $R^*_f(v)$ preserves the property $f(u) \in \mathrm{Int}(\mathrm{Conv}_f(\Gamma_G(u)))$ for all
$u \in V - V(C) - v$. Moreover, $R^*_f(v)$ can be obtained in $O(\sum_{u \in \Gamma_G(v)} |\Gamma_G(u)|)$
time. □

Based on Lemmas 5.1 and 5.2, we show the following theorem.

Theorem 5.3 For every 3-connected graph $G = (V, E)$ and every cycle $C$ of $G$,
there exists an SC-embedding with boundary $C$. Such an SC-embedding can be
found in $O(n^2)$ time.

Proof. First we compute, in linear time, a sparse 3-connected spanning subgraph
$G' = (V,E')$ of $G$ such that $E' \subseteq E$ and $|E'| = O(n)$. We put back all edges
in $E(C)$ to $G'$ (if any of them is missing in $G'$). Clearly, an SC-embedding of
$G'$ is also an SC-embedding of $G$.

We embed the vertices in $V(C)$ so that the cycle $C$ forms a convex polygon
in the plane. An edge $e = (u,v)$ is called internal if $\{u,v\} \cap V(C) = \emptyset$. Then
we apply the following procedure. We choose an arbitrary internal edge $e$ in $G'$,
and if $G' - e$ is a subdivision of a 3-connected graph, then we remove the edge
$e$ from $G'$. Moreover, if there exist (one or two) vertices $v$ whose degree is two
in $G' - e$, then, for each such $v$, we replace the two incident edges $(u_1, v)$ and
$(v, u_2)$ with a single edge $(u_1, u_2)$ and remove $v$. The resulting graph is denoted
by $G' - e$. Otherwise, if $G' - e$ is not a subdivision of a 3-connected graph,
then we contract the edge $e$. By Lemma 5.1, the resulting graph in both cases
is 3-connected. Finally, we denote the resulting graph again by $G'$.
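
The two graph operations used in this procedure can be sketched as follows, assuming a simple undirected graph stored as a dictionary of neighbor sets. This is only an illustration of the edge deletion with degree-two suppression and of the edge contraction; the test that decides between them (whether $G' - e$ is a subdivision of a 3-connected graph) is not included, and the function names are hypothetical.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def delete_edge_and_suppress(adj: Graph, u: Hashable, v: Hashable) -> Graph:
    """Remove edge (u, v); then replace each endpoint of degree two by a single edge."""
    new = {x: set(nb) for x, nb in adj.items()}
    new[u].discard(v)
    new[v].discard(u)
    for x in (u, v):
        if len(new[x]) == 2:
            a, b = new[x]
            new[a].discard(x)
            new[b].discard(x)
            new[a].add(b)  # a parallel edge collapses, since the graph is kept simple
            new[b].add(a)
            del new[x]
    return new


def contract_edge(adj: Graph, u: Hashable, v: Hashable) -> Graph:
    """Contract edge (u, v) into u, dropping the loop and merging parallel edges."""
    merged = (adj[u] | adj[v]) - {u, v}
    new = {x: set(nb) for x, nb in adj.items() if x != v}
    new[u] = set(merged)
    for x in merged:
        new[x].discard(v)
        new[x].add(u)
    return new
```

For instance, deleting one edge of $K_{3,3}$ and suppressing the two resulting degree-two vertices yields $K_4$, and contracting an edge of $K_4$ yields a triangle.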

We repeat this procedure until there exists no internal edge in the current
$G'$. In this case, each vertex $v \notin V(C)$ is easily embedded (independently) so
that $f(v) \in \mathrm{Int}(\mathrm{Conv}_f(\Gamma_{G'}(v)))$ holds. Note that this procedure requires $O(n)$
3-connectivity tests on graphs with $O(n)$ edges each (due to the sparsification).
Thus it requires $O(n^2)$ time altogether.

Now we return the removed or contracted edges into $G'$ in the reverse order
of the above procedure. As a general step, we consider how to construct an
SC-embedding $f'$ of $G'$, assuming that an SC-embedding $f$ for a 3-connected graph
$H = G' - e$ or $H = G'/e$ is available, where $e = (v_1, v_2)$ is an internal edge in
$G'$.

Case (i): $H = G'/e$.

Let $v^*$ be the vertex in $G'/e$ created as a result of contracting the internal
edge $e = (v_1, v_2)$. To regain $G'$, we insert $e$ in $G'/e$ after splitting $v^*$ into $v_1$
and $v_2$. To find an SC-embedding $f'$ of $G'$, we will determine the point $f'(v_2)$
while keeping $f'(v_1) = f(v^*)$ and $f'(u) = f(u)$ for $u \in V(G') - \{v_1, v_2\}$. Let
$N_1 = \Gamma_{G'}(v_1) - v_2$ and $N_2 = \Gamma_{G'}(v_2) - v_1$. If $f(v^*) \in \mathrm{Int}(\mathrm{Conv}_f(N_1))$ in $f$
(hence $f'(v_1) = f(v^*) \in \mathrm{Int}(\mathrm{Conv}_{f'}(N_1))$ in $f'$), then let $R_2 = R^2$. Otherwise (if
$f(v^*) \notin \mathrm{Int}(\mathrm{Conv}_f(N_1))$), we let $R_2 = \mathrm{Cone}_f(v_2, v_1)$ (see Figure 3(a)) as defined
before Lemma 5.2. This $R_2$ contains no points in $f(N_1)$, and if $f'(v_2) \in R_2$ is
chosen, then $f'(v_1)$ $(= f(v^*)) \in \mathrm{Int}(\mathrm{Conv}_{f'}(\Gamma_{G'}(v_1)))$ holds.

Also, to satisfy $f'(v_2) \in \mathrm{Int}(\mathrm{Conv}_{f'}(\Gamma_{G'}(v_2)))$ $(= \mathrm{Int}(\mathrm{Conv}_f(N_2 \cup \{v^*\})))$
(see Figure 3(b)), we choose a point $f'(v_2) \in R_2 \cap \mathrm{Int}(\mathrm{Conv}_f(N_2 \cup \{v^*\}))$
(which is not empty, since otherwise $f(v^*)$ would be an apex of the convex hull
$\mathrm{Int}(\mathrm{Conv}_f(\Gamma_{G'/e}(v^*)))$).

Therefore, by Lemma 5.2, if $f'(v_2)$ is chosen from $R_2 \cap \mathrm{Int}(\mathrm{Conv}_f(N_2 \cup \{v^*\})) \cap R^*_f(v^*)$,
then the resulting embedding $f'$ is an SC-embedding of $G'$.
Observe that $R_2 \cap \mathrm{Int}(\mathrm{Conv}_f(N_2 \cup \{v^*\})) \cap R^*_f(v^*)$ is a non-empty open set, and
hence such a choice (satisfying $f'(v_2) \neq f'(v)$ for all $v \in V(G') - v_2$, as well) is
possible.





Fig. 3. Illustration for $R_2 = \mathrm{Cone}_f(v_2, v_1)$ and $\mathrm{Int}(\mathrm{Conv}_f(N_2 \cup \{v^*\}))$.



Case (ii): $H = G' - e$. This case can be treated by an argument similar to that of Case (i)
(the proof is omitted). From the above argument and by Lemma 5.2, given an
SC-embedding $f$ of $G' - e$ (or $G'/e$), we can construct an SC-embedding $f'$
of $G'$ in $O(|E'|)$ $(= O(n))$ time. Since this procedure is executed $O(n)$ times to
construct an SC-embedding of the original graph $G$, the entire running time for
finding an SC-embedding $f$ of $G$ is $O(n^2)$. □

As shown by Theorem 5.3, 3-connectivity is a sufficient condition for a
graph $G$ to have an SC-embedding. We can show that the problem of testing
whether a general graph $G$ admits an SC-embedding is NP-hard (the proof is
omitted).

Theorem 5.4 The problem of deciding whether $G = (V, E)$ has an SC-embedding
is NP-hard. □

6 Finding a 2-Bisection in a 3-Connected Graph 

By combining the algorithmic proof of Theorem 5.3 and the ham-sandwich cut
algorithm in two dimensions, we are now able to obtain a polynomial time
algorithm that finds a 2-bisection in a 3-connected graph. This also implies that
Conjecture 3.1 is true for $k = 2$.

Algorithm BISECTION$(G, T_1, T_2)$

Input: A 3-connected graph $G$ and subsets $T_1, T_2 \subseteq V$ such that $|T_i|$ $(i = 1, 2)$
are even.

Output: A 2-bisection $\{V_1, V_2\}$ of $(G, T_1, T_2)$.

1. Choose an arbitrary cycle $C$, and construct an SC-embedding $f$ with
boundary $C$ in $R^2$ in $O(n^2)$ time by Theorem 5.3.

2. Let $W = \{f(v) \mid v \in T_1\}$ and $B = \{f(v) \mid v \in T_2\}$, and let $W' = W \cup \{a^*\}$
and $B' = B \cup \{a^*\}$ by adding a dummy point $a^* \in R^2$ (hence both $|W'|$ and
$|B'|$ are odd integers). By applying the ham-sandwich cut algorithm to
$W'$ and $B'$, compute a ham-sandwich cut $L$ such that each side of $L$ contains
$(|W'| - 1)/2$ points in $W'$ and $(|B'| - 1)/2$ points in $B'$ (and hence one point
$p_1 \in W'$ and one point $p_2 \in B'$ are on $L$).

3. If $p_1 = p_2 = a^*$, then output the current bipartition $\{V_1, V_2\}$ of $G$ generated
by $L$.

4. Otherwise, if $p_1 = p_2 \neq a^*$ or $p_1 \neq p_2$ (hence $a^* \notin \{p_1, p_2\}$), then define
$p_1$ and $p_2$ to be on that side of $L$ which contains $a^*$. Output the resulting
bipartition $\{V_1, V_2\}$ of $G$, neglecting $a^*$.
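
To make Step 2 concrete, the following sketch finds a ham-sandwich cut by brute force. It is not the linear time algorithm the text relies on: it simply tries every line through one point of each set, which takes cubic time. It assumes both sets have odd size, that they are disjoint, and that no three of the points are collinear, so the shared dummy point $a^*$ of the algorithm is not modeled. The function names are illustrative.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def side(p: Point, q: Point, r: Point) -> float:
    """Positive if r is left of the directed line p -> q, negative if right, 0 if on it."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def balanced(p: Point, q: Point, pts: List[Point]) -> bool:
    """True if each open side of line pq holds exactly (|pts| - 1) / 2 points of pts."""
    left = sum(1 for r in pts if side(p, q, r) > 0)
    right = sum(1 for r in pts if side(p, q, r) < 0)
    return left == right == (len(pts) - 1) // 2


def brute_force_ham_sandwich(white: List[Point],
                             black: List[Point]) -> Optional[Tuple[Point, Point]]:
    """Return (p1, p2), one point of each set, spanning a line that bisects both sets."""
    for p in white:
        for q in black:
            if p != q and balanced(p, q, white) and balanced(p, q, black):
                return p, q
    return None
```

Under the general position assumption the ham-sandwich theorem guarantees that some such pair exists, so the search returns a cut; the vertices strictly on each side then form the two parts.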

Note that every point not on the boundary $C$ is located strictly inside the
convex hull of its neighbors in an SC-embedding $f$ in Step 1. As mentioned
in Section 2.3, the ham-sandwich algorithms may perturb some points in
an input $f(V)$ by a small amount $\epsilon$. This, however, does not affect the use of
Lemma 4.2 because such $\epsilon$ can be chosen arbitrarily small. Therefore, by applying
Lemma 4.2 to the SC-embedding in Step 3 or 4, we obtain a partition $V_1$ and
$V_2$, where each $V_j$ induces a connected subgraph of $G$. Since both $T_1$ and $T_2$
are equally divided by $\{V_1, V_2\}$, the output $\{V_1, V_2\}$ is a 2-bisection of $G$ with
respect to $T_1$ and $T_2$.

An SC-embedding $f$ in Step 1 can be obtained in $O(n^2)$ time by Theorem 5.3.
Steps 2 and 3 can be done in $O(n + m)$ time using the linear time ham-sandwich
cut algorithm. From the above discussion, the next theorem is established.

Theorem 6.1 Let $G = (V,E)$ be a 3-connected graph. Then there exists a
2-bisection in $G$, and such a bisection can be computed in $O(n^2)$ time. □

7 Remarks 

If a graph $G$ is isomorphic to a point-edge graph of a convex polytope in $R^3$,
then $G$ is called polyhedral. The following characterization of polyhedral graphs
is well-known.

Lemma 7.1 A graph $G$ is a polyhedral graph if and only if $G$ is 3-connected
and planar. □
In order to prove Conjecture 3.1 for $k = 3$ by a similar application of the
3-dimensional ham-sandwich theorem (see Theorem 2.1) and Lemma 4.2, it would
be sufficient to prove that every 4-connected graph $G$ has an SC-embedding
$f : V \to R^3$. In such an embedding the boundary $G'$ should be a polyhedral
subgraph of $G$ by Definition 4.1(i). We conjecture that such an embedding exists
(for every proper choice of the boundary).

Conjecture 7.2 For a 4-connected graph $G = (V, E)$ and any polyhedral
subgraph $G'$ of $G$, there is an SC-embedding $f : V \to R^3$ with boundary $G'$. □

As opposed to 3-connected graphs, it is not clear whether every 4-connected
graph has a subgraph which can be chosen to be the boundary of an
SC-embedding.

Conjecture 7.3 Every 4-connected graph $G = (V,E)$ has a 3-connected planar
subgraph $G' = (V', E')$. □

Conjectures 7.2 and 7.3 together would imply the existence of a 3-bisection
in a 4-connected graph, that is, would verify Conjecture 3.1 for $k = 3$.
Abstract. In this paper we consider a special case of the Maximum
Weighted Independent Set problem for graphs: given a vertex- and
edge-weighted tree $T = (V, E)$ where $|V| = n$, and a real number $b$, determine
the largest weighted subset $P$ of $V$ such that the distance between the
two closest elements of $P$ is at least $b$. We present an $O(n \log^3 n)$
algorithm for this problem when the vertices have unequal weights. The
space requirement is $O(n \log n)$. This is the first known subquadratic
algorithm for the problem. This solution leads to an $O(n \log^4 n)$ algorithm
for the previously studied Weighted Max-Min Dispersion Problem.



1 Introduction 

Many problems in location theory deal with the placement of facilities on a
network so as to maximize or minimize certain functions of distances between
the facilities, or between facilities and the nodes of the network. One family
of location problems deals with dispersion: the placement of sets of facilities
within the network so as to maximize the distances among the facilities (see,
for example, [4,10,13,16,18]), or the sizes of structures associated with these
distances (such as the minimum spanning trees, Steiner trees, and Hamiltonian
tours studied in [9]). Mathematical models for dispersion assume that the given
network is represented by a graph $G = (V, E)$ where each edge $e \in E$ has edge
length $d(e)$. For vertices $x, y \in V$, let $d(x,y)$ denote the length of the shortest
path between $x$ and $y$ in $G$.

The following Weighted Max-Min Dispersion Problem (WMMDP) was
discussed in [3,4,11,13].

Instance

A graph $G = (V, E)$, vertex weights $\omega(v) \geq 0$ for all $v \in V$, edge lengths
$d(e) \geq 0$ for all $e \in E$, and a real number $W > 0$.

Requirement

Find a subset $P$ of $V$ such that $\omega(P) \geq W$ and $f(P) = \min_{x,y \in P;\, x \neq y} d(x,y)$ is
maximized.



A. Aggarwal, C. Pandu Rangan (Eds.): ISAAC’99, LNCS 1741, pp. 435-|44^ 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 
Like most other variants of dispersion problems, the WMMDP problem is
NP-hard for general graphs [3,5]. Recently, for the case where the underlying graph
is a tree, Bhattacharya and Houle [1] gave a simple $O(n)$ algorithm for the unweighted
version of the problem. However, the weighted version takes $O(n^2)$ time and
space to solve. Clearly, WMMDP is polynomially solvable for network $G$ if the
following problem, called WMinGap, is polynomially solvable.

Instance

A graph $G = (V, E)$, vertex weights $\omega(v) \geq 0$ for all $v \in V$, edge lengths
$d(e) \geq 0$ for all $e \in E$, and a real number $b > 0$.

Requirement

Find a largest weighted subset $P$ of $V$ such that $f(P) = \min_{x,y \in P;\, x \neq y} d(x,y) \geq b$.

In this paper we present a solution to the WMinGap problem. The proposed
WMinGap algorithm requires $O(n \log^3 n)$ time and $O(n \log n)$ space in the worst
case, which implies that the Weighted Max-Min Dispersion Problem (WMMDP) can
be solved in $O(n \log^4 n)$ time for an acyclic network.

In the general setting, WMinGap and WMMDP were recently investigated by
Rosenkrantz, Tayi and Ravi [14], along with a number of related problems in
both graph and geometric settings. Although both WMinGap and WMMDP are 
NP-hard problems, in the case where the underlying network is a linked list, 
they showed that the problems may be solved in 0{n log n) time and 0{n^ log n) 
time, respectively. A recent version of [14] indicates that both WMinGap and 
WMMDP can be solved in 0(n log n) time for the linked list case. 

In the next section, we shall present some of the terminology to be used in the
paper. In Section 3, the algorithm for the WMinGap problem will be presented.
Concluding remarks appear in Section 4.

2 Notation and Preliminaries 

For our discussion of the WMinGap problem, we shall assume that the input
$T = (V,E)$ is a binary tree rooted at $r$; otherwise, $T$ can be converted into
a binary tree by replacing each vertex $v$ of degree $\delta > 3$ by a binary tree of
$\delta - 2$ dummy vertices joined by edges of length zero (the particular shape of this
'dummy tree' is unimportant).

For the WMinGap problem, a (valid) placement $P$ is a subset of $V$ such that
no two vertices $x, y \in P$ have distance $d(x,y) < b$ in $T$. The weight $\omega(P)$ of a
placement $P \subseteq V$ is defined as the sum of the individual vertex weights $\omega(v)$
over all $v \in P$. If $P$ is a subset of the subtree of $T$ rooted at $r$, then we define the
depth $\Delta(P, r)$ of $P$ with respect to $r$ to be $\min_{v \in P} d(v, r)$, the minimum distance
from $r$ to the vertices of $P$.
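
These three notions (validity, weight and depth of a placement) translate directly into code. The sketch below assumes the tree is given as an adjacency dictionary mapping each vertex to a list of (neighbor, edge length) pairs; the representation and the function names are illustrative only, and the validity test is a plain quadratic check rather than anything used by the algorithm of this paper.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Tree = Dict[Hashable, List[Tuple[Hashable, float]]]


def tree_distances(adj: Tree, source: Hashable) -> Dict[Hashable, float]:
    """Distances from `source` to every vertex; paths are unique in a tree."""
    dist = {source: 0}
    stack = [source]
    while stack:
        x = stack.pop()
        for y, length in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + length
                stack.append(y)
    return dist


def is_valid_placement(adj: Tree, placement: Iterable[Hashable], b: float) -> bool:
    """True if no two distinct vertices of the placement are closer than b."""
    nodes = list(placement)
    for i, x in enumerate(nodes):
        dist = tree_distances(adj, x)
        if any(dist[y] < b for y in nodes[i + 1:]):
            return False
    return True


def placement_weight(omega: Dict[Hashable, float], placement: Iterable[Hashable]) -> float:
    """Sum of the vertex weights over the placement."""
    return sum(omega[v] for v in placement)


def placement_depth(adj: Tree, placement: Iterable[Hashable], r: Hashable) -> float:
    """Minimum distance from r to a vertex of the (non-empty) placement."""
    dist = tree_distances(adj, r)
    return min(dist[v] for v in placement)
```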

3 Weighted Min-Gap Problem 

When the vertices of $T$ have equal weight, Bhattacharya and Houle [1] showed
that the greedy strategy of always building upon the deepest optimal placements
of subtrees leads to a simple and efficient algorithm for finding the deepest
optimal placement of their parent. If the vertices of $T$ are allowed to have arbitrary
non-negative weights, the greedy strategy fails.

The strategy which will be used to solve the WMinGap problem is that of 
dynamic programming. Instead of maintaining a single ‘best’ choice of optimal 
placement for each subtree of T, we will instead maintain a list of candidate 
valid placements for each subtree. The list associated with the subtree rooted 
at vertex v will be constructed by merging the lists associated with its left and 
right children. 

During each merge, placements for subtrees which cannot contribute to an 
optimal placement for T must be pruned. The following lemma states a condition 
by which such suboptimal placements may be identified. 

Lemma 1. Let $T$ be a binary tree of $n$ vertices rooted at $r$ such that all vertices
and all edges are assigned a non-negative real weight, and let $P$ be a valid
placement for $T$. Let $v$ be a vertex of $T$, and let $P^v$ be the set of those vertices of
$P$ located in the subtree $T^v$ rooted at $v$. If $P_v$ is a valid placement for $T^v$ such
that $\Delta(P_v, v) \geq \Delta(P^v, v)$ and $\omega(P_v) > \omega(P^v)$, then $P$ cannot be an optimal
placement for $T$.

Lemma 1 allows us to maintain a list of candidate placements for the subtree
rooted at $v$ in which the depths of the placements appear in increasing order,
and the weights of the placements appear in decreasing order. The lemma implies
that if any placement for $T^v$ cannot be inserted into the list without violating
these orders, the placement cannot be part of an optimal placement for $T$, and
shall hence be referred to as redundant.

Lemma 2. Let $T$ be a binary tree of $n$ vertices rooted at $r$ such that all
vertices and all edges are assigned a non-negative real weight, and let $P$ be a valid
placement for $T$. Let $v$ be a vertex of $T$, and let $P^v$ be the set of those vertices
of $P$ located in the subtree $T^v$ rooted at $v$. If $\Delta(P^v, v) \geq b$, and if $P_v$ is a valid
placement for $T^v$ such that $\Delta(P^v, v) > \Delta(P_v, v) \geq b$ and $\omega(P^v) < \omega(P_v)$, then
$P$ cannot be an optimal placement for $T$.

Any placement $P^v$ as defined in Lemma 2 will also be said to be redundant.
The lemma implies that at most one placement of depth $b$ or greater need be
kept for any subtree of $T$; moreover, if different placements of identical weight
exist, it suffices to keep one placement of greatest depth.

At each vertex $v$ of $T$, information regarding all possible non-redundant
placements will be stored. By maintaining the invariant that each vertex traversed
stores a representation of all non-redundant placements, when the algorithm
terminates, the optimal placement will be one of the non-redundant placements
associated with $r$, the root of $T$. In addition to its weight $\omega(v)$, the information
stored at node $v$ consists of:

- dep($v$)

A list of the depths of candidate placements for $T^v$, sorted in increasing
order. Only the last entry of dep($v$) is allowed to be greater than or equal to
$b$. Initially, the list is empty.

- wt($v$)

A list of the weights of candidate placements for $T^v$, sorted in decreasing
order. The $i$th entries in lists wt($v$) and dep($v$) correspond to the same candidate
placement. Initially, the list is empty.

- $P^v$

A list of candidate placements where dep($v$)$[i] = d(v, P^v[i])$.

In addition, the lengths of the edges of $T$ adjacent to $v$ are assumed to be
accessible from $v$.



3.1 Compact and Combine 

Before presenting the main algorithm, we will first discuss two of the operations
to be performed on the lists maintained at the nodes $v$ of $T$: Compact and Combine.
The Compact operation involves traversing the list wt($v$), and deleting any entry
whose successor in the list is greater than or equal to it, thus ensuring that
wt($v$) is kept in decreasing order. It also ensures that only one entry having
depth dep($v$)$[i] \geq b$ is maintained: the entry having largest weight wt($v$)$[i]$.
Whenever entry $i$ is to be purged, it is simultaneously removed from the three
lists wt($v$), dep($v$) and $P^v$.

Compact can be performed in time proportional to the size of the lists.
Lemmas 1 and 2 justify the use of Compact on lists whenever the entries of dep($v$)
are in non-decreasing order.
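
A possible reading of Compact is sketched below, assuming the three lists are plain Python lists indexed in parallel and that dep is already in non-decreasing order. The right-to-left scan with a running maximum is our own way of reaching the fixed point of the successor rule in a single pass; the paper does not prescribe this particular loop, and the function name is illustrative.

```python
from typing import Any, List, Tuple


def compact(dep: List[float], wt: List[float], placements: List[Any],
            b: float) -> Tuple[List[float], List[float], List[Any]]:
    """Prune redundant entries; expects dep in non-decreasing order."""
    # Pass 1 (Lemma 1): keep entry i only if it is strictly heavier than every
    # later entry, since later entries are at least as deep.
    kept: List[int] = []
    heaviest_later = float("-inf")
    for i in range(len(dep) - 1, -1, -1):
        if wt[i] > heaviest_later:
            kept.append(i)
            heaviest_later = wt[i]
    kept.reverse()

    # Pass 2 (Lemma 2): among entries of depth >= b keep only the heaviest,
    # which is the first one because weights now strictly decrease.
    result: List[int] = []
    deep_seen = False
    for i in kept:
        if dep[i] >= b:
            if deep_seen:
                continue
            deep_seen = True
        result.append(i)

    return ([dep[i] for i in result],
            [wt[i] for i in result],
            [placements[i] for i in result])
```

With depths [1, 2, 3, 5, 7], weights [10, 12, 8, 6, 4] and $b = 4$, the first entry is dominated by the second and the last entry is dropped in favor of the heavier entry of depth 5.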

The Combine operation is somewhat more complex, and is at the heart of
Algorithm WMinGap. Let $u$ and $w$ be the left and right children of vertex $v$, and
consider the lists dep($u$), wt($u$) and $P^u$; and dep($w$), wt($w$) and $P^w$ as described
above. The object of Combine is to construct lists dep($v$), wt($v$) and $P^v$ which
reflect all non-redundant valid placements for $T^v$ which do not include $v$ itself.

Let us assume that before performing Combine, Compact has been performed
on the lists, and that at most one entry in dep($u$) and dep($w$) is greater than or
equal to $b$ (the last entry in each).

The first step of Combine involves making copies of the above lists, which
we denote dep*($u$), wt*($u$) and $P^u_*$; and dep*($w$), wt*($w$) and $P^w_*$. We next add
$d(u,v)$ to each entry of dep*($u$), and $d(w,v)$ to each entry of dep*($w$), so that
each list now reflects depths relative to $v$.

There are three ways in which a placement $P^u[i]$ under $u$ can be combined
with a placement $P^w[j]$ under $w$ to yield placements for $T^v$:

- dep*($u$)$[i] < b/2$ and dep*($w$)$[j] \geq b/2$,

- dep*($u$)$[i] \geq b/2$ and dep*($w$)$[j] < b/2$,

- dep*($u$)$[i] \geq b/2$ and dep*($w$)$[j] \geq b/2$.

The fourth possibility, dep*($u$)$[i] < b/2$ and dep*($w$)$[j] < b/2$, leads to an invalid
placement for $T^v$, and is not considered.
The cases can be handled as follows:

1. dep*($u$)$[i] < b/2$ and dep*($w$)$[j] \geq b/2$.

Here, a placement $P^u[i]$ with dep*($u$)$[i] < b/2$ can only be combined with
placements $P^w[j]$ for those $j$ such that dep*($w$)$[j] \geq b -$ dep*($u$)$[i]$. The
combined weight of the placement would then be wt*($u$)$[i]$ + wt*($w$)$[j]$, and
the depth of the placement would be dep*($u$)$[i]$. Since the entries of wt*($w$)
appear in decreasing order, Lemma 1 implies that of all the placements
$P^w[j]$ satisfying dep*($w$)$[j] \geq b -$ dep*($u$)$[i]$, only the one with smallest
index $j$ need be considered; all other choices of $j$ lead to redundant
placements. Therefore, for each $P^u[i]$, we determine the smallest index $j$ such
that dep*($w$)$[j] \geq$ dep*($u$)$[i]$ and dep*($w$)$[j] \geq b -$ dep*($u$)$[i]$.

2. dep*($u$)$[i] \geq b/2$ and dep*($w$)$[j] < b/2$.

In this case, a placement $P^w[j]$ with dep*($w$)$[j] < b/2$ can only be combined
with placements $P^u[i]$ for those $i$ such that dep*($u$)$[i] \geq b -$ dep*($w$)$[j]$.
This case is entirely symmetric to the previous case, and in a similar
manner output lists can be constructed which describe all non-redundant valid
placements resulting from combinations involving $P^w[j]$ where dep*($w$)$[j] < b/2$.

3. dep*($u$)$[i] \geq b/2$ and dep*($w$)$[j] \geq b/2$.

In the third case, any placement $P^u[i]$ with dep*($u$)$[i] \geq b/2$ can be combined
freely with placements $P^w[j]$ with dep*($w$)$[j] \geq b/2$. However, if $P^u[i]$ has
depth in $T^v$ less than or equal to that of $P^w[j]$, then as in the first case,
Lemma 1 indicates that of all the placements $P^w[j]$ satisfying dep*($w$)$[j] \geq$
dep*($u$)$[i]$, only the one with smallest index $j$ need be considered; all other
choices of $j$ lead to redundant placements.

It is clear from the above that it is possible to combine dep($u$), wt($u$) and $P^u$ with
dep($w$), wt($w$) and $P^w$ to obtain dep($v$), wt($v$) and $P^v$ in time proportional to
the size of the lists involved. This results in an $O(n^2)$ algorithm for the WMinGap
problem. The storage space requirement is also $O(n^2)$.

3.2 Modified Combine Algorithm 

We now present a modification of Combine which runs in subquadratic time. 



Intervals of Placements Let $v$ be a node of the tree, and let $u$ and $w$ be its
left and right children. We assume that the lists dep($u$), wt($u$), $P^u$,
dep($w$), wt($w$) and $P^w$ are available. Let $n_u$ and $n_w$ be the sizes of $P^u$ and $P^w$,
respectively. We are interested in obtaining the lists dep($v$), wt($v$) and $P^v$.

Let $S^v$ be the sorted sequence of the distances from $v$ of the elements of $T^v$.
Let $s^v_i$ be the element with rank $i$ in $S^v$; that is, the distance of $s^v_i$ from $v$ is
the $i$th smallest in $S^v$. Let $r^u_i$ be the rank of dep($u$)$[i] + d(u,v)$ in $S^v$. Clearly,
$r^u_i < r^u_{i+1}$ for each $i$. Let $r^w_j$ be the rank of dep($w$)$[j] + d(w,v)$ in $S^v$. Again,
$r^w_j < r^w_{j+1}$ for each such $j$. In general, there is no such relationship between $r^u_i$
and $r^w_j$.
Each placement $P^u[i]$ of $u$ and $P^w[j]$ of $w$ can be ranked according to
the distances from $v$ of their elements closest to $v$. More precisely, the rank
rank($P^u[i]$) of $P^u[i]$ is the rank in $S^v$ of the element $x \in P^u[i]$ such that
$d(v,x) = d(v,u) +$ dep($u$)$[i]$. The element $x$ shall be called the depth element
depelt($P^u[i]$) of $P^u[i]$.

Consider now the set $X$ of placements $P^w[j]$ such that

1. $d(v, w) +$ dep($w$)$[j] < d(v, u) +$ dep($u$)$[i]$, and

2. dep($w$)$[j] + d(v, w) + d(v, u) +$ dep($u$)$[i] \geq b$.

Clearly, any such placement $P^w[j]$ can be combined with $P^u[i]$. Moreover,
the depth of the resulting set would be determined by depelt($P^w[j]$). Let $r^-(i)$
and $r^+(i)$ be the minimum and the maximum ranks in $S^v$ of placements $P^w[j]$
satisfying the above conditions with respect to $P^u[i]$. The placements of $P^w$ with
ranks in the interval $[r^-(i), r^+(i)]$ are precisely the set $X$.

Let $[r^-(h), r^+(h)]$ be the interval of ranks obtained with respect to $P^u[h]$, for any
$h > i$. It is possible that the intervals $[r^-(i), r^+(i)]$ and $[r^-(h), r^+(h)]$ overlap. Let $x$ be
an element which belongs to both. Since a placement of $P^w$ with depth element
$x$ can be combined with both $P^u[i]$ and $P^u[h]$, and since wt($u$)$[i] \geq$ wt($u$)$[h]$,
Lemma 1 implies that we need not consider combining $x$ with $P^u[h]$. Therefore,
for each $P^u[i]$ we consider only the subset $X_i \subseteq P^w$ such that for any $x \in X_i$:

1. $d(v, w) +$ dep($w$)$[j] < d(v, u) +$ dep($u$)$[i]$,

2. dep($w$)$[j] + d(v, w) + d(v, u) +$ dep($u$)$[i] \geq b$, and

3. $x$ is not an element of any $X_h$, $h < i$.

Redefined so that all three of these conditions are satisfied, the interval of ranks
$[r^-(i), r^+(i)]$ shall be denoted $I^u(i)$.

The interval $[r^-(i), r^+(i)]$ represents the set of depth elements in $T^w$ with which
$P^u[i]$ can be profitably combined while keeping the depth element in $T^w$. By
attaching to this interval the weight of $P^u[i]$, we obtain a description of how $P^u[i]$
can contribute towards placements with depth elements located in $T^w$. This will
be particularly useful when considering a chain of subtrees as in Figure 1: best
placements with depth elements in subtree $T^{v_j}$ can be combined with placements
from subtrees $T^{v_i}$ (for $i > j$) by accumulating such intervals in an efficient
structure, and summing the contributions once the accumulation process has
terminated.



The Interval Structure We build a full binary tree $BT(v)$ with leaf nodes
associated with the elements of $S^v$, as shown in Figure 1. The leaves are ordered
from left to right, with the leftmost leaf corresponding to $s^v_1$, and the rightmost
leaf to the element of largest rank in $S^v$.

Each leaf $s^v_i$ will be assigned a weight $\omega'(i)$ which will accumulate a portion
of the weight of the best placement for which $s^v_i$ determines the depth. The
node $s^v_i$ can be either active or inactive within the structure: in the former case,
$\omega'(i)$ is positive; in the latter case $\omega'(i) = -\infty$. Initially, all leaves are set to be
inactive. Whenever a leaf changes status from inactive to active, it receives a
timestamp.

At the internal nodes of $BT(v)$, information will be stored so as to allow the
following operations to be performed efficiently:

1. Determine whether $s^v_i$ is active.

2. Determine the closest active leaf nodes to the left and to the right of $s^v_i$.

3. Change $s^v_i$ from inactive to active or vice versa. (This is done by changing
its weight $\omega'(i)$.)

We augment the structure so that it can also be used as a segment tree (see
Preparata and Shamos [12] for details). Subtree $T^{v_k}$ will contribute intervals
$I^{v_k}(i)$ for insertion into the segment tree, for all $i$ from 1 to $n_{v_k}$. Each interval will
store the weight of $P^{v_k}[i]$, and will receive a timestamp when inserted into the
tree. At any given internal node of the segment tree, intervals will be maintained
in increasing order of their timestamps.

The tree will be maintained in such a way that the weight of the best
placement for a depth element $s^v_i$ is the weight of $s^v_i$ plus the sum of the weights of
intervals containing its rank $r^v_i$ having timestamp greater than that of $s^v_i$.
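
The timestamp rule in the last sentence can be stated in a few lines. The sketch below evaluates it by scanning a flat list of intervals, which takes linear time per query; it only illustrates the rule and stands in for the segment tree, which answers the same query faster. Intervals are assumed to be tuples (low rank, high rank, weight, timestamp) with inclusive endpoints, and the function name is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int, float, int]  # (low rank, high rank, weight, timestamp)


def true_leaf_weight(rank: int, leaf_weight: float, leaf_stamp: int,
                     intervals: List[Interval]) -> float:
    """Leaf weight plus the weights of later-stamped intervals covering its rank."""
    return leaf_weight + sum(
        weight
        for low, high, weight, stamp in intervals
        if low <= rank <= high and stamp > leaf_stamp
    )
```

An interval inserted before the leaf became active is ignored, because its contribution is already reflected in the weight the leaf was given on activation.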



Description of the Algorithm The modified Combine algorithm, MCombine,
processes the subtrees from $T^{v_1}$ to $T^{v_m}$, in order. For each subtree $T^{v_k}$, each
placement of $P^{v_k}$ is considered.

Algorithm MCombine handles placements $P^{v_k}[i]$ in two phases. In the first
phase, its depth element is made active, and its weight is initialized to that
of the best placement involving elements of $T^{v_1} \cup \ldots \cup T^{v_k}$. The tree will be
assumed to have been compacted by the beginning of this phase, and as such
the placement sought is simply $P^{v_k}[i] \cup P^{v_{k-1}}[j]$, where depelt($P^{v_{k-1}}[j]$) is the
shallowest depth element such that $P^{v_k}[i]$ and $P^{v_{k-1}}[j]$ can be combined. The
best placement weight when $v_k$ itself is the depth element is also determined in
this manner.

After the first phase has terminated, the second phase updates the best
placements with depth elements in $T^{v_1} \cup \ldots \cup T^{v_{k-1}}$, to take into account a possible
merge with $P^{v_k}[i]$. This is done by inserting the interval $I^{v_k}(i)$ into the segment
tree, weighted with wt($v_k$)$[i]$ and timestamped. The weight of placements with
depth element in $T^{v_1} \cup \ldots \cup T^{v_{k-1}}$ corresponding to leaf $l$ in $BT(v)$ can thus be
read by adding to its weight $\omega'(l)$ the weight $\omega'(I)$ of each interval $I$ containing
$l$ and having timestamp greater than that of $l$.

In both phases, we compact $BT(v)$. This is done by determining the true
weights of the active leaf nodes of $BT(v)$ immediately to the left and right of the
node associated with each modified element. In the case of modifications which
occurred in the first phase, let $x$ and $y$ be the active elements immediately to the
left (lower rank) and the right (higher rank) of the leaf node $r^{v_k}_i$ associated with
$P^{v_k}[i]$. It follows from Lemma 1 that, if the weight of $y$ is greater than $\omega'(r^{v_k}_i)$,
the corresponding leaf can be eliminated from $BT(v)$ by changing its status
to inactive. If the weight of $x$ is less than $\omega'(r^{v_k}_i)$, we can eliminate $x$ instead. In
this case, we repeat the process until no more elements can be eliminated from
$BT(v)$.

The case of modifications occurring in the second phase is very similar to
that of the first phase. However, note that we need only initiate this compaction
process from the two endpoints of interval $I^{v_k}(i)$.
The modified combine algorithm can formally be described as follows:

Algorithm MCombine

1. Initialization
   a) Initialize the lists $P^v$ and dep($v$).
   b) $t \leftarrow 0$.
   c) For each $i = 1, 2, \ldots, n_{v_1}$ do
      i. Let $p \leftarrow$ rank($P^{v_1}[i]$).
      ii. Insert depelt($P^{v_1}[i]$) in $BT(v)$ at the leaf node location $p$.
      iii. Set $\omega'(p) \leftarrow$ wt($v_1$)$[i]$ and stamp($p$) $\leftarrow t$; $t \leftarrow t + 1$.

2. For $k \leftarrow 2, \ldots, m$ do {first phase}:
   a) For $i \leftarrow 1, 2, \ldots, n_{v_k}$, determine the element $x_i$ in $BT(v)$ with the
      minimum depth in $T^{v_1} \cup \ldots \cup T^{v_{k-1}}$ which can be combined with $P^{v_k}[i]$ in
      such a way that the result has depth element depelt($P^{v_k}[i]$). Let $y_i$ be
      this weight.
   b) For $i \leftarrow 1, 2, \ldots, n_{v_k}$,
      i. Let $p \leftarrow$ rank($P^{v_k}[i]$).
      ii. Set $\omega'(p) \leftarrow \omega'(x_i) + y_i$ and stamp($p$) $\leftarrow t$; $t \leftarrow t + 1$.
      iii. Compact $BT(v)$ at $p$.
   c) As in the two previous steps, for depth element $v_k$ itself.

3. For $k \leftarrow 2, \ldots, m$ do {second phase}:
   a) For $i \leftarrow 1, 2, \ldots, n_{v_k}$, determine the interval $I = [r^-, r^+]$ of the active
      elements in $BT(v)$ satisfying the three conditions of Section 3.2.
   b) Set $\omega'(I) \leftarrow$ wt($v_k$)$[i]$ and stamp($I$) $\leftarrow t$; $t \leftarrow t + 1$.
   c) Insert $I$ into $BT(v)$, and compact $BT(v)$ at $r^-$ and $r^+$.

After iteration $k$, taking into consideration only the elements in $T^{v_1} \cup
\ldots \cup T^{v_k}$, the weight of an active point (at leaf node location $l$ of $BT(v)$) can be
determined by adding to $\omega'(l)$ the sum of $\omega'(I)$ over all intervals $I$ containing $l$
with stamp($I$) > stamp($l$). Once the algorithm has terminated, the true weights
of the active points can be determined in this way.
Time and Space Complexity We know that any given interval can appear in
at most $O(\log n)$ internal nodes of a segment tree, and that the set of intervals
appearing at a given internal node is maintained in increasing order of
timestamp. It is easy to show that

1. The structure requires $O(n \log n)$ space.

2. Each interval can be inserted into $BT(v)$ in $O(\log^2 n)$ time.

3. Each compaction operation requires $O((c+1) \log^2 n)$ time, where $c$ is the
number of leaves eliminated (rendered inactive) as a result of the operation.

4. Given any leaf $l$ and a timestamp $t$, it is possible to determine in $O(\log^2 n)$
time the total weight of the intervals containing $l$ with timestamps greater
than $t$.

Noting that a leaf can be eliminated by compaction at most once, the work
associated with each insertion or deletion requires $O(\log^2 n)$ amortized time.
The total time required for an execution of MCombine on a chain of subtrees is
therefore $O(n \log^2 n)$.



3.3 Weighted Min-Gap Algorithm 

The WMinGap algorithm for tree T with root r can now be formally described, 
as follows: 

Algorithm WMinGap 

1. a) Determine the path from r to the centroid v_m of T.

{The path to the centroid decomposes T into a set of subtrees T^{v_i},
i = 1, 2, . . . , m, such that the size of each subtree is no more than
two-thirds the size of the original tree T.}

2. Solve the WMinGap problem for each of the subtrees T^{v_i}, i = 1, . . . , k.

{The lists dep(v_i), wt(v_i), and the binary tree BT(v_i) are known.}

3. Apply MCombine on the solutions of Step 2 to obtain BT(v), from which
the lists dep(v), wt(v), and P^v can be calculated.

We can compute the centroid of T in O(n) time, from which the subtrees
T^{v_i}, i = 1, 2, . . . , k, can be determined in O(n) time. We have seen that
Step 3 takes O(n log^ n) time.
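
The linear-time centroid step can be sketched as follows. This is a minimal illustration that assumes the common definition of a centroid (a vertex whose removal leaves components of at most n/2 vertices, which is within the two-thirds bound quoted above); the function name `centroid_path` and the edge-list input format are assumptions made for the example.

```python
from collections import defaultdict


def centroid_path(n, edges, root=0):
    """Return the path from `root` to a centroid of a tree on vertices 0..n-1."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from the root: record parents and a preorder.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)

    # Subtree sizes, accumulated in reverse preorder (children before parents).
    size = {u: 1 for u in order}
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]

    # Descend into the child whose subtree holds more than half the vertices;
    # stop when no such child exists.
    path = [root]
    u = root
    while True:
        heavy = None
        for v in adj[u]:
            if v != parent[u] and 2 * size[v] > n:
                heavy = v
                break
        if heavy is None:
            return path
        path.append(heavy)
        u = heavy


# A path graph 0-1-2-3-4 rooted at 0: the centroid is the middle vertex.
print(centroid_path(5, [(0, 1), (1, 2), (2, 3), (3, 4)]))
```

The subtrees hanging off the returned path are then the inputs to the recursive calls of Step 2.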

It is not difficult to retrieve the final solution from BT(v). The optimal
placement is referenced by the first entries of the lists stored at r. This follows
from the fact that all non-redundant placements for P are referenced by the
lists. Let p be the node of T corresponding to the first active leaf of BT(v).
Node p must be included in the optimal placement. To obtain the other vertices
of the optimal placement, it suffices to recursively traverse the subtrees of p in P.
These vertices are identified by the intervals encountered in the path in BT(r)
from the root r to the leaf node containing p. We can complete this process in
O(n log^ n) time.

From the preceding discussion, the following theorem holds. 
Theorem 1. Let T be a tree of n vertices, whose vertices and edges are assigned
non-negative real-valued weights. Given any positive real number b, Algorithm
WMinGap computes in O(n log^ n) worst-case time and O(n log n) space a structure
from which in O(n log^ n) additional time a placement P can be read which
is optimal for T with respect to b.

4 Conclusion 

In this paper, we have presented for acyclic networks an O(n log^ n) time and
O(n log n) space solution to the WMinGap facility location problem. The strategy
of eliminating redundant placements which we described is similar in many respects
to that used for certain other problems in which a dominating subsequence of
potentially-optimal solutions is maintained (for example, the applications involving optimal
layouts of convex shapes studied in [2] and in m- For these problems the best 
solutions known take quadratic time in the worst case. It would be interesting 
to see if the techniques presented here lead to subquadratic solutions for these 
problems as well. 